Partial resampling to approximate covering integer programs
Abstract

We consider column-sparse positive covering integer programs, which generalize set cover and 
which have attracted a long line of research developing (randomized) approximation algorithms. 
We develop a new rounding scheme based on the Partial Resampling variant of the Lovász
Local Lemma developed by Harris & Srinivasan (2013). This achieves an approximation ratio
of $1 + \frac{\ln(\Delta_1+1)}{a_{\min}} + O\Big(\sqrt{\frac{\log(\Delta_1+1)}{a_{\min}}}\Big)$, where $a_{\min}$ is the minimum covering constraint and $\Delta_1$ is the
maximum $\ell_1$-norm of any column of the covering matrix (whose entries are scaled to lie in $[0,1]$).
When there are additional constraints on the sizes of the variables, we show an approximation
ratio of $1 + O\Big(\frac{\log(\Delta_1+1)}{a_{\min}\epsilon} + \sqrt{\frac{\log(\Delta_1+1)}{a_{\min}}}\Big)$ to satisfy these size constraints up to a multiplicative
factor $1 + \epsilon$, or an approximation ratio of $\ln \Delta_0 + O(\sqrt{\log \Delta_0})$ to satisfy the size constraints
exactly (where $\Delta_0$ is the maximum number of non-zero entries in any column of the covering
matrix). We also show nearly-matching inapproximability and integrality-gap lower bounds.
These results improve asymptotically, in several different ways, over results shown by Srinivasan
(2006) and Kolliopoulos & Young (2005).

We show also that our algorithm automatically handles multi-criteria programs, efficiently 
achieving approximation ratios which are essentially equivalent to the single-criterion case and 
which apply even when the number of criteria is large. 


1 Introduction 


We consider positive covering integer programs - or simply covering integer programs (CIPs) -
defined as follows (with $\mathbb{Z}_+$ denoting the set of non-negative integers). We have solution variables
$x_1, \dots, x_n \in \mathbb{Z}_+$, and for $k = 1, \dots, m$, a system of $m$ covering constraints of the form:

$$\sum_i A_{ki} x_i \ge a_k$$

Here $A_k$ is an $n$-long non-negative vector; by scaling, we can assume that $A_{ki} \in [0,1]$ and $a_k \ge 1$.
We can write this more compactly as $A_k \cdot x \ge a_k$. We may optionally have constraints on the size
of the solution variables, namely, that we require $x_i \in \{0, 1, \dots, d_i\}$; these are referred to as the
multiplicity constraints. Finally, we have some linear objective function, represented by a vector
$C \in [0, \infty)^n$. Our goal is to minimize $C \cdot x$, subject to the multiplicity and covering constraints.
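
As an illustration only, the definition above can be checked mechanically. The following minimal Python sketch assumes a dense list-of-rows representation of $A$; the function names `covering_satisfied`, `multiplicity_satisfied` and `objective`, and the toy set-cover instance, are hypothetical and not part of the original development.

```python
def covering_satisfied(A, a, x):
    """Return True if A_k . x >= a_k holds for every covering constraint k."""
    return all(
        sum(A_k[i] * x[i] for i in range(len(x))) >= a_k
        for A_k, a_k in zip(A, a)
    )


def multiplicity_satisfied(x, d):
    """Return True if 0 <= x_i <= d_i holds for every variable i."""
    return all(0 <= x_i <= d_i for x_i, d_i in zip(x, d))


def objective(C, x):
    """Return the linear objective value C . x."""
    return sum(c_i * x_i for c_i, x_i in zip(C, x))


# Hypothetical set-cover instance: rows are elements (constraints),
# columns are sets (variables); a_k = 1 and A_ki in {0, 1}.
A = [[1, 0, 0],
     [1, 1, 0],
     [0, 1, 1]]
a = [1, 1, 1]
C = [1, 1, 1]
d = [1, 1, 1]

feasible = [1, 1, 0]    # picks the first two sets
infeasible = [1, 0, 0]  # leaves the third element uncovered
```

Here `feasible` meets every covering and multiplicity constraint at cost 2, while `infeasible` violates the third covering constraint.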

This generalizes the set-cover problem, which can be viewed as a special case in which $a_k = 1$, $A_{ki} \in
\{0,1\}$. Solving set cover or integer programs exactly is NP-hard [12], so a common strategy is to
obtain a solution which is approximately optimal. There are at least three ways one may obtain
an approximate solution, where OPT denotes the optimal solution-value for the given instance:

1. the solution $x$ may violate the optimality constraint, that is, $C \cdot x > \text{OPT}$;

2. $x$ may violate the multiplicity constraint: i.e., $x_i > d_i$ for some $i$;

3. $x$ may violate the covering constraints: i.e., $A_k \cdot x < a_k$ for some $k$.


These three criteria are in competition. For our purposes, we will demand that our solution x 
completely satisfies the covering constraints. We will seek to satisfy the multiplicity constraints 
and optimality constraint as closely as possible. Our emphasis will be on the optimality constraints, 
that is, we seek to ensure that 

$$C \cdot x \le \beta \times \text{OPT}$$

where $\beta \ge 1$ is “small”. The parameter $\beta$, in this context, is referred to as the approximation ratio.
More precisely, we will derive a randomized algorithm with the goal of satisfying $E[C \cdot x] \le \beta \times \text{OPT}$,
where the expectation is taken over our algorithm’s randomness.

Many approximation algorithms for set cover and its extensions give approximation ratios as a 
function of m, the total number of constraints: e.g., it is known that the greedy algorithm has 
approximation ratio $(1 - o(1)) \ln m$ [18]. We often prefer a scale-free approximation ratio, that
does not depend on the problem size but only on its structural properties. Two cases that are 
of particular interest are when the matrix A is row-sparse (a bounded number of variables per 
constraint) or column-sparse (each variable appears in a bounded number of constraints). We will 
be concerned solely with the column-sparse setting in this paper. The row-sparse setting, which 
generalizes problems such as vertex cover, typically leads to very different types of algorithms than 
the column-sparse setting. 

Two common parameters used to measure the column sparsity of such systems are the maximum
$\ell_0$ and $\ell_1$ norms of the columns; that is,

$$\Delta_0 = \max_i \#\{k : A_{ki} > 0\}, \qquad \Delta_1 = \max_i \sum_k A_{ki}$$

Since the entries of $A$ are in $[0,1]$, we have $\Delta_1 \le \Delta_0$; it is also possible that $\Delta_1 \ll \Delta_0$.
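
For concreteness, the two sparsity parameters can be computed directly from the matrix. The sketch below is only an illustration under the assumption that $A$ is stored as a dense list of rows; the name `column_norms` and the example matrix are hypothetical.

```python
def column_norms(A):
    """Return (Delta_0, Delta_1): the maximum number of non-zero entries in
    any column of A, and the maximum column sum of A."""
    n = len(A[0])
    delta0 = max(sum(1 for row in A if row[i] > 0) for i in range(n))
    delta1 = max(sum(row[i] for row in A) for i in range(n))
    return delta0, delta1


# Hypothetical 3 x 2 matrix with entries in [0, 1].
A = [[0.5, 0.1],
     [0.5, 0.1],
     [0.0, 0.1]]
delta0, delta1 = column_norms(A)
```

In this example the second column has three small non-zero entries, so it determines $\Delta_0$, while the first column has the larger sum and determines $\Delta_1$.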

Approximation algorithms for column-sparse CIPs typically yield approximation ratios which are a
function of $\Delta_0$ or $\Delta_1$, and possibly other problem parameters as well. These algorithms fall into two
main classes. First, there are greedy algorithms: they start by setting $x = 0$, and then increment
$x_i$ where $i$ is chosen in some way which “looks best” in a myopic way for the residual problem.
These were first developed by [3] for set cover, and later analysis (see [?]) showed that they give
essentially optimal approximation ratios for set cover. These were extended to CIP in [8] and [5],
showing an approximation ratio of $1 + \ln \Delta_0$. These greedy algorithms are often powerful, but they
are somewhat rigid. In addition, the greedy algorithms do not yield “oblivious” approximation
ratios: that is, the greedy algorithm can only operate with knowledge of the objective function.

An alternative, and often more flexible, class of approximation algorithms is based on linear
relaxation. There are a number of possible linear relaxations, but the simplest is one which we refer to
as the basic LP. This LP has the same covering constraints as the original CIP, but replaces the
constraint $x_i \in \{0, 1, \dots, d_i\}$ with the weaker constraint $\hat{x}_i \in [0, d_i]$. The set of feasible points to
the basic LP is a polytope, and one can find its exact optimal fractional solution $\hat{x}$. As this is a
relaxation, we have $C \cdot \hat{x} \le \text{OPT}$. It thus suffices to turn the solution $\hat{x}$ into a random integral
solution $x$ satisfying $E[C \cdot x] \le \beta (C \cdot \hat{x})$. We will also see some stronger LP formulations, such as
the Knapsack-Cover (KC) inequalities.

Randomized rounding is often employed to transform solutions to the basic LP back to a feasible
integral solution. The simplest scheme, first applied to this context by [?], is to simply draw $x_i$
as independent Bernoulli($\alpha \hat{x}_i$), for some $\alpha > 1$. When this is used, simple analysis using Chernoff
bounds shows that $A_k \cdot x \ge a_k$ simultaneously for all $k$ when $\alpha \ge 1 + c_0 \Big(\frac{\log m}{a_k} + \sqrt{\frac{\log m}{a_k}}\Big)$, where $c_0 > 0$
is some absolute constant. Thus, the overall solution $C \cdot x$ is within a factor of $1 + O\Big(\frac{\log m}{a_{\min}} + \sqrt{\frac{\log m}{a_{\min}}}\Big)$
from the optimum, where $a_{\min} = \min_k a_k \ge 1$. One noteworthy aspect here is that this randomized
rounding does not depend on the specific objective function; in this sense, it is “oblivious”, yielding
a good expected value for any objective function.
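
The simple independent rounding scheme described above can be sketched in a few lines of Python. This is only an illustration: the name `independent_rounding` is hypothetical, and capping the probability at 1 is an assumption of this sketch for entries with $\alpha \hat{x}_i > 1$, which the text does not address.

```python
import random


def independent_rounding(x_hat, alpha, rng=None):
    """Draw each x_i independently as Bernoulli(alpha * x_hat_i).

    The objective vector C is never consulted, which is the sense in which
    the scheme is oblivious. Capping at 1.0 is an assumption of this sketch.
    """
    rng = rng or random.Random()
    return [1 if rng.random() < min(1.0, alpha * xh) else 0 for xh in x_hat]


# Hypothetical example chosen so that the outcome is deterministic:
# the probabilities are alpha * 0.0 = 0 and alpha * 0.5 = 1.
x = independent_rounding([0.0, 0.5], alpha=2.0, rng=random.Random(1))
```

Note that nothing in this scheme repairs a violated covering constraint; feasibility holds only with the probability given by the Chernoff-bound analysis.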

In [19], Srinivasan gave a scale-free method of randomized rounding (ignoring multiplicity
constraints), based on the FKG inequality and some proof ideas behind the Lovász Local Lemma
(LLL). This gave an approximation ratio of $1 + O\Big(\frac{\log(\Delta_0+1)}{a_{\min}} + \sqrt{\frac{\log a_{\min}}{a_{\min}} + \frac{\log(\Delta_0+1)}{a_{\min}}}\Big)$. The rounding
scheme, by itself, only gave a positive (exponentially small) probability of achieving the desired
approximation ratio. The algorithm of [19] also included a polynomial-time derandomization using
the method of conditional expectations; this derandomization, however, requires knowledge of the
objective function.

The algorithm of Srinivasan can potentially cause a large violation in the multiplicity constraints.
Kolliopoulos & Young (2005) considered how to modify the algorithm of [19] to respect the multiplicity
constraints. They gave two algorithms, which offer different types of approximation ratios. The
first algorithm takes a parameter $\epsilon \in (0,1]$, violates each multiplicity constraint “$x_i \le d_i$” to at most
“$x_i \le \lceil (1+\epsilon) d_i \rceil$”, and has approximation ratio of $O\big(1 + \frac{\log \Delta_0}{a_{\min} \epsilon^2}\big)$. (We refer to this situation as
$\epsilon$-respecting multiplicity.) The second algorithm meets the multiplicity constraints exactly and achieves
approximation ratio $O(1 + \log \Delta_0)$.


1.1 Our contributions 

In this paper, we give a new randomized rounding scheme, based on the partial resampling variant
of the LLL developed in [10] and some proof ideas developed in [9] for the LLL applied to systems
of correlated constraints. We show the following result:

Theorem 1.1. Suppose we have a fractional solution $\hat{x}$ for the basic LP. Let $\gamma = \frac{\ln(\Delta_1+1)}{a_{\min}}$. Then our
randomized algorithm yields a solution $x \in \mathbb{Z}_+^n$ satisfying the covering constraints with probability
one, and with

$$E[x_i] \le \hat{x}_i (1 + \gamma + 4\sqrt{\gamma})$$

The expected running time of this rounding algorithm is $O(mn)$.


This automatically implies that $E[C \cdot x] \le \beta C \cdot \hat{x} \le \beta \times \text{OPT}$ for $\beta = 1 + \gamma + 4\sqrt{\gamma}$. Our algorithm
has several advantages over previous techniques.

1. We give approximation ratios in terms of $\Delta_1$, the maximum $\ell_1$-norm of the columns of $A$.
Such bounds are always stronger than those phrased in terms of the corresponding $\ell_0$-norm.

2. When $\Delta_1$ is small, our approximation ratio is asymptotically smaller than that of [19]. In
particular, we avoid the $\sqrt{\frac{\log a_{\min}}{a_{\min}}}$ term in our approximation ratio.

3. When $\Delta_1$ is large, then our approximation ratio is roughly $\gamma$; this is asymptotically optimal
(including having the correct coefficient), and improves on [19].

4. This algorithm is quite efficient, essentially as fast as reading in the matrix $A$.

5. The algorithm is oblivious to the objective function: although it achieves a good approximation
factor for any objective $C$, the algorithm itself does not use $C$ in any way.


We find it interesting that one can “boil down” the parameters $\Delta_1, a_{\min}$ into a single parameter $\gamma$,
which seems to completely determine the behavior of our algorithm.

Our partial resampling algorithm in its simplest form could significantly violate the multiplicity 
constraints. By choosing slightly different parameters for our algorithm, we can ensure that the 
multiplicity constraints are nearly satisfied, at the cost of a worsened approximation ratio: 

Theorem 1.2. Suppose we have a fractional solution $\hat{x}$ for the basic LP. Let $\gamma = \frac{\ln(\Delta_1+1)}{a_{\min}}$. For
any given $\epsilon \in (0,1]$, our algorithm yields a solution $x \in \mathbb{Z}_+^n$ satisfying the covering constraints with
probability one, and with

$$x_i \le \lceil \hat{x}_i (1 + \epsilon) \rceil, \qquad E[x_i] \le \hat{x}_i (1 + 4\sqrt{\gamma} + 4\gamma/\epsilon)$$


This is an asymptotic improvement over the approximation ratio of Kolliopoulos & Young (2005), in three different ways:

1. It depends on the $\ell_1$-norm of the columns, not the $\ell_0$-norm;

2. When $\gamma$ is large, it is smaller by a full factor of $1/\epsilon$;

3. When $\gamma$ is small, it gives an approximation ratio which approaches 1, at a rate independent
of $\epsilon$.
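
To get a feel for the two guarantees, the ratios from Theorems 1.1 and 1.2 can be evaluated numerically. The following Python sketch is only an illustration; the function names `gamma`, `ratio_theorem_1_1` and `ratio_theorem_1_2` are hypothetical, and the sample parameter values are arbitrary.

```python
import math


def gamma(delta1, a_min):
    """gamma = ln(Delta_1 + 1) / a_min."""
    return math.log(delta1 + 1) / a_min


def ratio_theorem_1_1(delta1, a_min):
    """Bound 1 + gamma + 4 sqrt(gamma) on E[x_i] / x_hat_i (no multiplicity control)."""
    g = gamma(delta1, a_min)
    return 1 + g + 4 * math.sqrt(g)


def ratio_theorem_1_2(delta1, a_min, eps):
    """Bound 1 + 4 sqrt(gamma) + 4 gamma / eps when multiplicity is eps-respected."""
    g = gamma(delta1, a_min)
    return 1 + 4 * math.sqrt(g) + 4 * g / eps


# Arbitrary sample values: Delta_1 = e - 1 and a_min = 1 give gamma = 1.
beta_plain = ratio_theorem_1_1(math.e - 1, 1)
beta_eps = ratio_theorem_1_2(math.e - 1, 1, eps=1.0)
```

Both bounds depend on the instance only through $\gamma$ (and $\epsilon$), and both tend to 1 as $a_{\min}$ grows with $\Delta_1$ fixed.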


The two previous approximation ratios are both given in terms of the basic LP. We also give an
approximation algorithm based on the KC inequalities, which is a stronger linear relaxation than
the basic LP. This gives a different type of asymptotic guarantee, which is phrased in terms of the
optimal integral solution (not the optimal basic LP solution):

Theorem 1.3. There is a randomized algorithm running in expected polynomial time, yielding a
solution $x \in \mathbb{Z}_+^n$ which satisfies the covering constraints, multiplicity constraints, and has

$$C \cdot x \le \big(1 + \ln \Delta_0 + O(\sqrt{\log \Delta_0})\big) \, \text{OPT}$$

This improves over the corresponding approximation ratio of Kolliopoulos & Young (2005), in that it achieves the optimal
leading coefficient of $\ln \Delta_0$.

There are many ways of parametrizing CIPs; we have chosen to focus on the parameters such
as the minimum RHS value $a_{\min}$, the maximum $\ell_1$-column norm $\Delta_1$, and most importantly the
ratio $\ln(\Delta_1 + 1)/a_{\min}$. Our approximation ratios are functions of these parameters; we show a
number of matching lower bounds, which demonstrate that one cannot obtain significantly improved
approximation ratios which are parametrized as functions of these same parameters. The formal
statements of these results contain numerous qualifiers and technical conditions, but we summarize
these here.

1. When $\gamma$ is large, then assuming the Exponential Time Hypothesis, any polynomial-time
algorithm to solve the CIP (ignoring multiplicity constraints), whose approximation ratio is
parametrized as a function of $\gamma$, must have approximation ratio $\gamma - O(\log \gamma)$.

2. When $\gamma$ is large, then the integrality gap between the basic LP and integral solutions which
$\epsilon$-respect multiplicity, is of order $\Omega(\gamma/\epsilon)$.

3. When $\gamma$ is small, then the integrality gap of the basic LP is $1 + \Omega(\gamma)$.

In this sense, our approximation algorithms are nearly optimal as functions of the parameters $\gamma, \epsilon$.
On the other hand, there are many alternate parameters which could be analyzed instead, and
alternate approximation ratio guarantees given (which would be incomparable to ours).

Finally, we give an extension to covering programs with multiple linear criteria. Specifically, we
show that even conditional on our solution $x$ satisfying all the covering constraints, not only do we
have $E[C_\ell \cdot x] \le \beta C_\ell \cdot \hat{x}$ but that in fact the values of $C_\ell \cdot x$ are concentrated, roughly equivalent to
the $x_i$ being independently distributed as Bernoulli with probability $\beta \hat{x}_i$. Thus, for each $\ell$ there is
a very high probability that we have $C_\ell \cdot x \approx C_\ell \cdot \hat{x}$ and in particular there is a good probability
that we have $C_\ell \cdot x \approx C_\ell \cdot \hat{x}$ simultaneously for all $\ell$.

Theorem 1.4 (Informal). Suppose we are given a covering system with a fractional solution $\hat{x}$ and
with $r$ objective functions $C_1, \dots, C_r$, whose entries are in $[0,1]$ and such that $C_\ell \cdot \hat{x} \ge \Omega(\log r)$
for all $\ell = 1, \dots, r$. Let $\gamma = \frac{\ln(\Delta_1+1)}{a_{\min}}$. Then our solution $x$ satisfies the covering constraints with
probability one; with probability at least 1/2,

$$\forall \ell \qquad C_\ell \cdot x \le \beta (C_\ell \cdot \hat{x}) + O\big(\sqrt{\beta (C_\ell \cdot \hat{x}) \log r}\big)$$

where $\beta = 1 + \gamma + 4\sqrt{\gamma}$. (A similar result is possible, if we also want to ensure that $x_i \le \lceil \hat{x}_i (1 + \epsilon) \rceil$;
then the approximation ratio is $1 + 4\sqrt{\gamma} + 4\gamma/\epsilon$.)

This significantly improves on [19], in terms of both the approximation ratio as well as the running
time. Roughly speaking, the algorithm of [19] gave an approximation ratio of $O\big(1 + \frac{\log(1+\Delta_0)}{a_{\min}}\big)$
(worse than the approximation ratio in the single-criterion setting) and a running time of $n^{O(\log r)}$
(polynomial time only when $r$, the number of objective functions, is constant).

1.2 Outline


In Section 2, we develop a randomized rounding algorithm when the fractional solution satisfies
$\hat{x} \in [0, 1/\alpha)^n$; here $\alpha > 1$ is a key parameter which we will discuss how to select in later sections.
This randomized rounding produces a binary solution vector $x \in \{0,1\}^n$, for which
$E[x_i] \approx \alpha \hat{x}_i$.

In Section 3, we will develop a deterministic quantization scheme to handle fractional solutions of
arbitrary size, using the algorithm of Section 2 as a subroutine. We will show an upper bound on
the sizes of the variables $x_i$ in terms of the fractional $\hat{x}_i$. We will also show an upper bound on
$E[x_i]$, which we state in a generalized form without making reference to column-sparsity or other
properties of the matrix $A$.

In Section 4, we consider the case in which we have a lower bound $a_{\min}$ on the RHS constraint
vector $a$ as well as an upper bound $\Delta_1$ on the $\ell_1$-norm of the columns of $A$. Based on these
values, we set key parameters of the rounding algorithm, to obtain good approximation ratios as a
function of $a_{\min}, \Delta_1$. These approximation ratios do not respect multiplicity constraints.

In Section 5, we extend these results to take into account the multiplicity constraints as well.
We give two types of approximation algorithms: in the first, we $\epsilon$-respect the multiplicity
constraints. In the second, we respect the multiplicity constraints exactly.

In Section 6, we construct a variety of lower bounds on achievable approximation ratios. These are
based on integrality gaps as well as hardness results. These show that the approximation ratios
developed in Section 4 are essentially optimal for most values of $\epsilon$ and $\ln(\Delta_1 + 1)/a_{\min}$, particularly
when $\ln(\Delta_1) \gg a_{\min}$.

In Section 7, we show that our randomized rounding scheme obeys a negative correlation property,
allowing us to show concentration bounds on the sizes of the objective functions $C_\ell \cdot x$. This
significantly improves on the algorithm of [19]; we show asymptotically better approximation ratios
in many regimes, and we also give a polynomial-time algorithm regardless of the number of objective
functions.


1.3 Comparison with the Lovász Local Lemma

One type of rounding scheme that has been used for similar types of integer programs is based on 
the LLL; we contrast this with our approach taken here. 

The LLL, first introduced in [6], is often used to show that a rare combinatorial structure can 
be randomly sampled from a probability space. In the basic form of randomized rounding, one 
must ensure that the probability of a “bad-event” (an undesirable configuration of a subset of the 
variables), namely that $A_k \cdot x < a_k$, is on the order of $1/m$; this ensures that, with high
probability, no bad events occur. This accounts for the term $\log m$ in the approximation ratio. The
power of the LLL comes from the fact that the probability of a bad-event is not compared with
the total number of events, but only with the number of events it affects. Thus, one may hope to
show approximation ratios which are independent of $m$.

At a heuristic level, the LLL should be applicable to the CIP problem. We have a series of bad-
events, one for each covering constraint. Furthermore, because of our assumption that the system 
is column-sparse, each variable only affects a limited number of these bad-events. Thus, it should 
be possible to use the LLL to obtain a scale-free approximation ratio. 

There has been prior work applying the LLL to packing integer programs, such as [?]. One
technical problem with the LLL is that it only depends on whether bad-events affect each other, not
the degree to which they do so. Bad-events which are only slightly correlated are still considered as
dependent by the LLL. Thus, a weakness of the LLL for integer programs with arbitrary coefficients
(i.e. allowing $A_{ki} \in [0,1]$), is that potentially all the entries $A_{ki}$ could be extremely small yet
non-zero, causing every constraint to affect each other by a tiny amount. For this reason, typical
applications of the LLL to column-sparse integer programs have been phrased in terms of the $\ell_0$
column norm $\Delta_0$. For packing problems with no constraint-violation allowed, good approximations
parametrized by $\Delta_0$, but not in general by $\Delta_1$, are possible [1].

In [?], Harvey addressed this technical problem by applying a careful, multi-step quantization
scheme with iterated applications of the LLL, to discrepancy problems with coefficient matrices
where the $\ell_1$ norm of each column and each row is “small”.

The LLL, in its classical form, only shows that there is a small probability of avoiding all the
bad-events. Thus, it does not lead to efficient algorithms. In [15], Moser & Tardos solved this
longstanding problem by introducing a resampling-based algorithm. This algorithm initially samples
all random variables from the underlying probability space, and will continue resampling subsets of
variables until no more bad-events occur. Most applications of the LLL, such as [?], would yield
polynomial-time algorithms using this framework.

In the context of integer programming, the Moser-Tardos algorithm can be extended in ways which
go beyond the LLL itself. In [10], Harris & Srinivasan described a variant of the Moser-Tardos
algorithm based on “partial resampling”. In this scheme, when one encounters a bad-event, one only
resamples a random subset of the variables (where the probability distribution on which variables
to resample is carefully chosen). This was applied for “assignment-packing” integer programs with
small constraint violation. These bounds, like those of Harvey, depend on $\Delta_1$.

It is possible to formulate the CIP problem in the LLL framework, and to view our algorithm as 
a variant of the Moser-Tardos algorithm. This would achieve qualitatively similar bounds, albeit 
with asymptotics which are noticeably worse than the ones we give here. In particular, using the 
LLL directly, one cannot achieve approximation factors of the form $1 + \gamma$ when $\gamma \to \infty$; one obtains
instead an approximation ratio of $1 + c\gamma$ where $c$ is some constant strictly larger than one. The case
when $\gamma \to 0$ is more complicated and there the LLL-based approaches appear to be asymptotically
weaker by super-constant factors. 

The technical core of our algorithm is an adaptation of the partial resampling MT algorithm of [10]
combined with a methodology of [9] to yield improved probabilistic guarantees for LLL systems
with correlated constraints. These techniques can only be used when the original fractional solution
has entries which are small (and hence can be interpreted as probabilities); we develop a novel
preprocessing step to handle large fractional entries while giving good guarantees on the multiplicity
constraints.

Because so many different problem-specific techniques and calculations are combined with a variety
of LLL techniques, it is cumbersome to derive our algorithm directly as a special case or corollary of
results from [?]. For the most part, we will discuss our algorithm in a self-contained way, keeping
the comparison with the LLL more as informal motivation than technical guide.


2 The RELAXATION algorithm 

We first consider the case when all the values of $\hat{x}$ are small; this turns out to be the critical case
for this problem. In this case, we present an algorithm which we label RELAXATION. Initially,
this algorithm draws each $x_i$ as an independent Bernoulli trial with probability $p_i = \alpha \hat{x}_i$, for some
parameter $\alpha > 1$. This will satisfy many of the covering constraints, but there will still be some left
unsatisfied. We loop over all such constraints; whenever a constraint $k$ is unsatisfied, we modify the
solution as follows: for each variable $i$ which has $x_i = 0$, we set $x_i$ to be an independent Bernoulli
random variable with probability $\sigma A_{ki} \alpha \hat{x}_i$. Here $\sigma \in [0,1]$ is another parameter which we will also
discuss how to select.

For the remainder of Section 2, we assume throughout that $\sigma \in [0,1]$ and $\alpha > 1$ are given
parameters, and that in addition we have $\hat{x}_i < 1/\alpha$ for all $i \in [n]$. We assume also that $\hat{x}$ satisfies
the covering constraints, i.e. $A_k \cdot \hat{x} \ge a_k$ for all $k = 1, \dots, m$. These assumptions will not be stated
explicitly in the remainder.


Algorithm 1 The RELAXATION algorithm

1: function RELAXATION($\hat{x}$, $A$, $a$, $\sigma$, $\alpha$)
2:     for $i$ from $1, \dots, n$ do    ▷ Initialization
3:         $x_i \sim$ Bernoulli($\alpha \hat{x}_i$)
4:     while $A x \not\ge a$ do    ▷ The covering constraints are not all satisfied
5:         Let $k$ be minimal such that $A_k \cdot x < a_k$
6:         for $i$ from $1, \dots, n$ do
7:             if $x_i = 0$ then
8:                 $x_i \sim$ Bernoulli($\sigma A_{ki} \alpha \hat{x}_i$)
9:     return $x$



Note that this algorithm only increments the variables. Hence, when a constraint k is satisfied, it 
will remain satisfied until the end of the algorithm. 
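
The steps of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than a reference implementation: the matrix is a dense list of rows, the function name `relaxation` and the toy instance are hypothetical, and the loop is only guaranteed to terminate (with probability one) when $\sigma > 0$ and the standing assumptions of this section hold.

```python
import random


def relaxation(x_hat, A, a, sigma, alpha, rng=None):
    """Sketch of Algorithm 1; returns a 0/1 list x with A_k . x >= a_k for all k.

    Assumes x_hat_i < 1/alpha and A_k . x_hat >= a_k, as in the text.
    With sigma = 0 no resampling can change x, so the loop may never end.
    """
    rng = rng or random.Random()
    n, m = len(x_hat), len(A)
    # Lines 2-3: initial independent Bernoulli(alpha * x_hat_i) draws.
    x = [1 if rng.random() < alpha * x_hat[i] else 0 for i in range(n)]
    while True:
        # Line 5: find the minimal violated constraint, if there is one.
        k = next(
            (kk for kk in range(m)
             if sum(A[kk][i] * x[i] for i in range(n)) < a[kk]),
            None,
        )
        if k is None:
            return x  # Line 9: every covering constraint is satisfied.
        # Lines 6-8: resample constraint k; only variables equal to 0 can change.
        for i in range(n):
            if x[i] == 0 and rng.random() < sigma * A[k][i] * alpha * x_hat[i]:
                x[i] = 1


# Hypothetical toy instance: one constraint over four variables, with
# A_1 . x_hat = 1 >= a_1 and x_hat_i = 0.25 < 1/alpha = 0.5.
A = [[1.0, 1.0, 1.0, 1.0]]
a = [1.0]
x_hat = [0.25, 0.25, 0.25, 0.25]
x = relaxation(x_hat, A, a, sigma=0.5, alpha=2.0, rng=random.Random(0))
```

Because variables are only ever changed from 0 to 1, the returned vector is binary and any constraint found satisfied during the loop stays satisfied.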

Whenever we encounter an unsatisfied constraint $k$ and draw new values for the variables (lines
6-8), we refer to this as resampling the constraint $k$. There is an alternative way of looking at the
resampling procedure, which seems counterintuitive but will be crucial for our analysis. Instead of
setting each variable $x_i = 1$ with probability $\sigma A_{ki} \alpha \hat{x}_i$, we instead select a subset $Y \subseteq [n]$, where
each $i$ currently satisfying $x_i = 0$ goes into $Y$ independently with probability $\sigma A_{ki}$. Then, for each
variable $i \in Y$, we draw $x_i \sim$ Bernoulli($p_i$), where $p_i = \alpha \hat{x}_i$. It is clear that this two-part sampling
procedure is equivalent to the one-step procedure described in Algorithm 1. In this case, we say
that $Y$ is the resampled set for constraint $k$. If $i \in Y$ (for any constraint $k$) we say that variable $i$
is resampled.
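
The two-part view of a single resampling step can be sketched as follows. This is only an illustration: the function name `resample_two_stage` and the example values are hypothetical. A zero variable $i$ enters $Y$ with probability $\sigma A_{ki}$ and is then set to one with probability $p_i = \alpha \hat{x}_i$, so it becomes one with overall probability $\sigma A_{ki} \alpha \hat{x}_i$, matching line 8 of Algorithm 1.

```python
import random


def resample_two_stage(x, x_hat, A_k, sigma, alpha, rng):
    """One resampling of constraint k in the two-part form.

    Returns (Y, new_x), where Y is the resampled set and new_x is the
    updated 0/1 vector. Only variables currently equal to 0 can enter Y.
    """
    # Stage 1: each zero variable joins Y independently with prob. sigma * A_ki.
    Y = [i for i in range(len(x))
         if x[i] == 0 and rng.random() < sigma * A_k[i]]
    # Stage 2: each i in Y is redrawn as Bernoulli(p_i) with p_i = alpha * x_hat_i.
    new_x = list(x)
    for i in Y:
        if rng.random() < alpha * x_hat[i]:
            new_x[i] = 1
    return Y, new_x


# Hypothetical example: variable 0 is already 1, and A_k[3] = 0.
x = [1, 0, 0, 0]
x_hat = [0.2, 0.3, 0.4, 0.1]
A_k = [0.5, 1.0, 0.25, 0.0]
Y, new_x = resample_two_stage(x, x_hat, A_k, sigma=0.5, alpha=2.0,
                              rng=random.Random(3))
```

Keeping $Y$ explicit is what the analysis below relies on: a variable can be resampled (be in $Y$) and still be drawn as zero.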

For every variable $i$, we either have $x_i = 1$ at the initial sampling, or $x_i$ first becomes equal to one
during some resampling of a constraint $k$; or $x_i = 0$ at the end of the algorithm. If $x_i = 1$ for the
first time at the $j$-th resampling of constraint $k$, we say $i$ turns at $(k, j)$. If $x_i = 1$ initially, we say
that $i$ turns at 0.


In the algorithm as we have described, the first step is to set the variables $x_i$ as independent
Bernoulli with probability $p_i$. Our analysis, following [10] and [9], is based on an inductive argument,
in which we consider what occurs when $x$ is set to some arbitrary value. If $A x \ge a$, then the
algorithm is already finished. If not, there will be a series of modifications made to $x$ until it
terminates. Given any fixed value of $x$, we will show upper bounds on the probability of certain
future events.

Lemma 2.1. Let $Z_1, \dots, Z_j$ be subsets of $[n]$. The probability that the first $j$ resampled sets for
constraint $k$ are respectively $Z_1, \dots, Z_j$ is at most $\prod_{l=1}^{j} f_k(Z_l)$, where we define

$$f_k(Z) = (1 - \sigma)^{-a_k} \prod_{i \in [n]} (1 - A_{ki}\sigma) \prod_{i \in Z} \frac{(1 - p_i) A_{ki}\sigma}{1 - A_{ki}\sigma}$$


Proof. For any integer $T \ge 0$, any integer $j \ge 0$, any sets $Z_1, \dots, Z_j \subseteq [n]$ and any vector
$v \in \{0,1\}^n$, we define the following random process and the following event $\mathcal{E}(T, j, Z_1, \dots, Z_j, v)$:
Suppose that instead of drawing $x \sim$ Bernoulli($\alpha \hat{x}$) as in line 3 of RELAXATION, we set $x = v$,
and we continue the remaining steps of the RELAXATION algorithm (lines 4-8) until done. We
say that in this process event $\mathcal{E}(T, j, Z_1, \dots, Z_j, v)$ has occurred if:

1. There are fewer than $T$ total resamplings

2. There are at least $j$ resamplings of constraint $k$

3. The first $j$ resampled sets for constraint $k$ are respectively $Z_1, \dots, Z_j$.

We claim now that for any $Z_1, \dots, Z_j$, any $v \in \{0,1\}^n$, and any integer $T \ge 0$, we have

$$P(\mathcal{E}(T, j, Z_1, \dots, Z_j, v)) \le \frac{\prod_{l=1}^{j} f_k(Z_l)}{\prod_{i \in Z_1 \cup \dots \cup Z_j} (1 - p_i)} \qquad (1)$$

(Note that $p_i < 1$ by our assumption $\hat{x}_i < 1/\alpha$, and so the RHS of (1) is always well-defined.)

We shall prove (1) by induction on $T$. For the base case ($T = 0$) this is trivially true, because
$\mathcal{E}(T, j, Z_1, \dots, Z_j, v)$ is impossible (there must be at least 0 resamplings), and so the LHS of (1) is
zero while the RHS is non-negative. We move on to the induction step.

If $A v \ge a$, then the RELAXATION algorithm performs no resamplings. Thus, if $j \ge 1$, then event
$\mathcal{E}(T, j, Z_1, \dots, Z_j, v)$ is impossible and again (1) holds. On the other hand, if $j = 0$, then the RHS
of (1) is equal to one, and again this holds vacuously. So we suppose $A v \not\ge a$; let $k'$ be minimal
such that $A_{k'} v < a_{k'}$. Then the first step of RELAXATION is to resample constraint $k'$.

We observe that if $v_i = 1$ for any $i \in Z_1 \cup \dots \cup Z_j$, then the event $\mathcal{E}(T, j, Z_1, \dots, Z_j, v)$ is impossible.
The reason for this is that we only resample variables which are equal to zero; thus variable $i$ can
never be resampled for the remainder of the RELAXATION algorithm. In particular, we will never
have $i$ in any resampled set. Thus, as $i \in Z_1 \cup \dots \cup Z_j$, it is impossible for $Z_1, \dots, Z_j$ to eventually
be the resampled sets for constraint $k$. So if $v_i = 1$ for any $i \in Z_1 \cup \dots \cup Z_j$ then (1) holds vacuously.

Let $x'$ denote the value of the variables after the first resampling ($x'$ is a random variable). Then
we observe that the remaining steps of the RELAXATION algorithm are equivalent to what would
have occurred if we had set $x = x'$ initially.

Now, suppose that $k' \ne k$. Then after the first resampling, the event $\mathcal{E}(T, j, Z_1, \dots, Z_j, v)$ becomes
equivalent to the event $\mathcal{E}(T - 1, j, Z_1, \dots, Z_j, x')$. Thus, in this case, we have

$$P(\mathcal{E}(T, j, Z_1, \dots, Z_j, v)) = P(\mathcal{E}(T - 1, j, Z_1, \dots, Z_j, x')) \le \frac{\prod_{l=1}^{j} f_k(Z_l)}{\prod_{i \in Z_1 \cup \dots \cup Z_j} (1 - p_i)} \qquad \text{(induction hypothesis)}$$

and this shows the induction step as desired. (Note that here we are able to bound the probability
of the event $\mathcal{E}(T - 1, j, Z_1, \dots, Z_j, x')$, even though $x'$ is a random variable instead of a fixed vector,
because our induction hypothesis applies to all vectors $v \in \{0,1\}^n$.)


Next, suppose that $k = k'$. In this case, we observe that the following are necessary events for
$\mathcal{E}(T, j, Z_1, \dots, Z_j, v)$:

(B1) $Y = Z_1$, where $Y$ is the first resampled set for constraint $k' = k$.

(B2) For any $i \in Z_1 \cap (Z_2 \cup \dots \cup Z_j)$, in the first resampling step (which includes variable $i$), we
draw $x_i = 0$.

(B3) $\mathcal{E}(T - 1, j - 1, Z_2, Z_3, \dots, Z_j, x')$

The condition (B2) follows from the observation, made earlier, that $\mathcal{E}(T - 1, j - 1, Z_2, Z_3, \dots, Z_j, x')$
is impossible if $x'_i = 1$ but $i \in Z_2 \cup \dots \cup Z_j$. Any such $i \in Z_1$ must be resampled (due to condition
(B1)), and it must be resampled to become equal to zero.

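Let us first bound the probability of the condition (B1). As we put each $i$ into $Y$ with probability
$A_{ki}\sigma$ independently, the probability that all $i \in Z_1$ go into $Y$ is $\prod_{i \in Z_1} A_{ki}\sigma$. By the same token, if
$v_i = 0$, then $i$ avoids going into $Y$ with probability $1 - A_{ki}\sigma$. Therefore, the overall probability of
selecting $Y = Z_1$ is given by

$$P(Y = Z_1) = \prod_{i \in Z_1} A_{ki}\sigma \prod_{i \notin Z_1, v_i = 0} (1 - A_{ki}\sigma)$$

$$= \Big(\prod_{i \in Z_1} A_{ki}\sigma\Big) \Big(\prod_{i \notin Z_1} (1 - A_{ki}\sigma)\Big) \Big(\prod_{i : v_i = 1} (1 - A_{ki}\sigma)^{-1}\Big) \qquad \text{(as } v_i = 0 \text{ for all } i \in Z_1\text{)}$$

$$= \prod_{i \in [n]} (1 - A_{ki}\sigma) \prod_{i \in Z_1} \frac{A_{ki}\sigma}{1 - A_{ki}\sigma} \prod_{i : v_i = 1} (1 - A_{ki}\sigma)^{-1}$$

By definition of $k'$, we have that $A_k v < a_k$. By Proposition A.1 we thus have:

$$\prod_{i : v_i = 1} (1 - A_{ki}\sigma)^{-1} \le (1 - \sigma)^{-a_k}$$

further implying:

$$P(Y = Z_1) \le (1 - \sigma)^{-a_k} \prod_{i \in [n]} (1 - A_{ki}\sigma) \prod_{i \in Z_1} \frac{A_{ki}\sigma}{1 - A_{ki}\sigma} \qquad (2)$$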



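Next, let us consider the probability of (B2), conditional on (B1). Each $i \in Y$ is drawn independently
as Bernoulli-$p_i$; thus, the total probability of event (B2), conditional on (B1), is at most
$\prod_{i \in Z_1 \cap (Z_2 \cup \dots \cup Z_j)} (1 - p_i)$.

Finally, let us consider the probability of event (B3), conditional on (B1) and (B2). Observe that
the event $\mathcal{E}(T - 1, j - 1, Z_2, Z_3, \dots, Z_j, x')$ is conditionally independent of events (B1) and (B2),
given $x'$. By the law of total probability, we have

$$P(\mathcal{E}(T - 1, j - 1, Z_2, Z_3, \dots, Z_j, x') \mid (B1), (B2)) = \sum_{v' \in \{0,1\}^n} P(\mathcal{E}(T - 1, j - 1, Z_2, Z_3, \dots, Z_j, v')) \, P(x' = v' \mid (B1), (B2))$$

$$\le \sum_{v' \in \{0,1\}^n} \frac{\prod_{l=2}^{j} f_k(Z_l)}{\prod_{i \in Z_2 \cup \dots \cup Z_j} (1 - p_i)} \, P(x' = v' \mid (B1), (B2)) \qquad \text{(induction hypothesis)}$$

$$= \frac{\prod_{l=2}^{j} f_k(Z_l)}{\prod_{i \in Z_2 \cup \dots \cup Z_j} (1 - p_i)}$$

Thus, as (B1), (B2), and (B3) are necessary conditions for $\mathcal{E}(T, j, Z_1, \dots, Z_j, v)$, we have

$$P(\mathcal{E}(T, j, Z_1, \dots, Z_j, v)) \le (1 - \sigma)^{-a_k} \prod_{i \in [n]} (1 - A_{ki}\sigma) \prod_{i \in Z_1} \frac{A_{ki}\sigma}{1 - A_{ki}\sigma} \times \prod_{i \in Z_1 \cap (Z_2 \cup \dots \cup Z_j)} (1 - p_i) \times \frac{\prod_{l=2}^{j} f_k(Z_l)}{\prod_{i \in Z_2 \cup \dots \cup Z_j} (1 - p_i)}$$

$$= \frac{\prod_{l=1}^{j} f_k(Z_l)}{\prod_{i \in Z_1 \cup \dots \cup Z_j} (1 - p_i)}$$

and the induction claim again holds.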

Thus, we have shown that the induction claim holds for any integer $T \ge 0$, any $Z_1, \dots, Z_j$, and any $v \in \{0,1\}^n$.
Next, for any sets $Z_1, \dots, Z_j$ and any $v \in \{0,1\}^n$, let us define the event $\mathcal{E}(j, Z_1, \dots, Z_j, v)$ to be the
event that, if we start the RELAXATION algorithm with $x = v$, then the first $j$ resampled sets for
constraint $k$ are respectively $Z_1, \dots, Z_j$; we make no condition on the total number of resamplings.
Observe that we have the increasing chain

$$\mathcal{E}(0, j, Z_1, \dots, Z_j, v) \subseteq \mathcal{E}(1, j, Z_1, \dots, Z_j, v) \subseteq \mathcal{E}(2, j, Z_1, \dots, Z_j, v) \subseteq \cdots$$

and $\mathcal{E}(j, Z_1, \dots, Z_j, v) = \bigcup_{T=0}^{\infty} \mathcal{E}(T, j, Z_1, \dots, Z_j, v)$. By countable additivity of the probability
measure, we have:

$$P(\mathcal{E}(j, Z_1, \dots, Z_j, v)) = \lim_{T \to \infty} P(\mathcal{E}(T, j, Z_1, \dots, Z_j, v)) \le \frac{\prod_{l=1}^{j} f_k(Z_l)}{\prod_{i \in Z_1 \cup \dots \cup Z_j} (1 - p_i)}$$

So far, we have computed the probability of having $Z_1, \dots, Z_j$ be the first $j$ resampled sets for
constraint $k$, given that $x$ is fixed to an arbitrary initial value $v$. We now can compute the probability
that $Z_1, \dots, Z_j$ are the first $j$ resampled sets for constraint $k$ given that $x$ is drawn as independent
Bernoulli-$p_i$.


In the first step of the RELAXATION algorithm, we claim that a necessary event for $Z_1, \dots, Z_j$
to be the first $j$ resampled sets is to have $x_i = 0$ for each $i \in Z_1 \cup \dots \cup Z_j$; the rationale for this
is equivalent to that for (B2). This event has probability $\prod_{i \in Z_1 \cup \dots \cup Z_j} (1 - p_i)$. Subsequently the
event $\mathcal{E}(j, Z_1, \dots, Z_j, x)$ must occur.

The probability of $\mathcal{E}(j, Z_1, \dots, Z_j, x)$, conditional on $x_i = 0$ for all $i \in Z_1 \cup \dots \cup Z_j$, is at most
$\frac{\prod_{l=1}^{j} f_k(Z_l)}{\prod_{i \in Z_1 \cup \dots \cup Z_j} (1 - p_i)}$ (by a similar argument to that of computing the probability of (B3) conditional
on (B1), (B2)). Thus, the overall probability that the first $j$ resampled sets for constraint $k$ are
$Z_1, \dots, Z_j$ is at most

$$P(Z_1, \dots, Z_j \text{ are first } j \text{ resampled sets}) \le \prod_{i \in Z_1 \cup \dots \cup Z_j} (1 - p_i) \times \frac{\prod_{l=1}^{j} f_k(Z_l)}{\prod_{i \in Z_1 \cup \dots \cup Z_j} (1 - p_i)} = \prod_{l=1}^{j} f_k(Z_l)$$


as desired. 


□ 


We next compute $\sum_{Z \subseteq [n]} f_k(Z)$; such sums will recur in our calculations.

Proposition 2.2. Suppose $\alpha > \frac{-\ln(1-\sigma)}{\sigma}$. For any constraint $k$ define

$$s_k = (1-\sigma)^{-a_k} e^{-\sigma\alpha A_k \cdot \hat{x}}$$

Then $\sum_{Z \subseteq [n]} f_k(Z) \le s_k < 1$ for all $k = 1, \dots, m$.


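Proof. We have

$$\begin{aligned}
\sum_{Z \subseteq [n]} f_k(Z)
&= \sum_{Z \subseteq [n]} (1-\sigma)^{-a_k} \prod_{i \in [n]} (1 - A_{ki}\sigma) \prod_{i \in Z} \frac{(1-p_i) A_{ki}\sigma}{1 - A_{ki}\sigma} \\
&= (1-\sigma)^{-a_k} \prod_{i \in [n]} (1 - A_{ki}\sigma) \sum_{Z \subseteq [n]} \prod_{i \in Z} \frac{(1-p_i) A_{ki}\sigma}{1 - A_{ki}\sigma} \\
&= (1-\sigma)^{-a_k} \prod_{i \in [n]} (1 - A_{ki}\sigma) \prod_{i=1}^{n} \Big(1 + \frac{(1-p_i) A_{ki}\sigma}{1 - A_{ki}\sigma}\Big) \\
&= (1-\sigma)^{-a_k} \prod_{i \in [n]} (1 - A_{ki} p_i \sigma) \\
&\le (1-\sigma)^{-a_k} e^{-\sigma \sum_i A_{ki} p_i} = (1-\sigma)^{-a_k} e^{-\sigma\alpha A_k \cdot \hat{x}}
\end{aligned}$$

Also, noting that $\hat{x}$ satisfies the covering constraints (i.e., $A_k \cdot \hat{x} \ge a_k$), we have that

$$s_k = (1-\sigma)^{-a_k} e^{-\sigma\alpha A_k \cdot \hat{x}} < (1-\sigma)^{-a_k} e^{-\sigma a_k \cdot \frac{-\ln(1-\sigma)}{\sigma}} = 1$$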


□ 
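As a sanity check (an illustrative sketch, not part of the original proof), the following Python snippet enumerates every subset $Z \subseteq [n]$ for one small constraint and compares the brute-force sum of $f_k(Z)$ with the closed form $(1-\sigma)^{-a_k}\prod_i (1 - A_{ki} p_i \sigma)$ and with $s_k$. The form of $f_k$ is the one used in the first line of the proof of Proposition 2.2; the function names and the numerical instance are hypothetical.

```python
import itertools
import math


def f_k(Z, A_k, p, sigma, a_k):
    """f_k(Z) in the form used in the proof of Proposition 2.2 (assumed)."""
    val = (1 - sigma) ** (-a_k)
    for A in A_k:
        val *= 1 - A * sigma
    for i in Z:
        val *= (1 - p[i]) * A_k[i] * sigma / (1 - A_k[i] * sigma)
    return val


def compare_sums(A_k, x_hat, a_k, sigma, alpha):
    """Return (brute-force sum over all Z, closed form, s_k) for one constraint."""
    n = len(A_k)
    p = [alpha * xh for xh in x_hat]  # p_i = alpha * x_hat_i
    brute = sum(
        f_k(Z, A_k, p, sigma, a_k)
        for r in range(n + 1)
        for Z in itertools.combinations(range(n), r)
    )
    closed = (1 - sigma) ** (-a_k) * math.prod(
        1 - A_k[i] * p[i] * sigma for i in range(n)
    )
    dot = sum(A * xh for A, xh in zip(A_k, x_hat))
    s_k = (1 - sigma) ** (-a_k) * math.exp(-sigma * alpha * dot)
    return brute, closed, s_k


# Hypothetical instance: alpha = 2 exceeds -ln(1 - 0.5) / 0.5, and A_k . x_hat >= a_k.
brute, closed, s_k = compare_sums(
    A_k=[1.0, 0.5, 0.8, 0.3], x_hat=[0.5, 0.4, 0.5, 0.2], a_k=1.0, sigma=0.5, alpha=2.0
)
print(brute, closed, s_k)
```

On this instance the brute-force sum and the closed form agree up to floating-point error, and both lie below $s_k < 1$, as the proposition requires.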
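Proposition 2.3. For any constraint $k$ and any $i \in [n]$, we have

$$\sum_{Z \subseteq [n],\, Z \ni i} f_k(Z) \le s_k A_{ki} \sigma$$

Proof. We have:

$$\begin{aligned}
\sum_{Z \subseteq [n],\, Z \ni i} f_k(Z)
&= \sum_{Z \subseteq [n],\, Z \ni i} (1-\sigma)^{-a_k} \prod_{l \in [n]} (1 - A_{kl}\sigma) \prod_{l \in Z} \frac{(1-p_l) A_{kl}\sigma}{1 - A_{kl}\sigma} \\
&= (1-\sigma)^{-a_k} (1-p_i) A_{ki}\sigma \prod_{l \in [n]-\{i\}} (1 - A_{kl}\sigma) \sum_{Z \subseteq [n],\, Z \ni i}\ \prod_{l \in Z-\{i\}} \frac{(1-p_l) A_{kl}\sigma}{1 - A_{kl}\sigma} \\
&= (1-\sigma)^{-a_k} (1-p_i) A_{ki}\sigma \prod_{l \in [n]-\{i\}} (1 - A_{kl}\sigma) \prod_{l \in [n]-\{i\}} \Big(1 + \frac{(1-p_l) A_{kl}\sigma}{1 - A_{kl}\sigma}\Big) \\
&= (1-\sigma)^{-a_k} (1-p_i) A_{ki}\sigma \prod_{l \in [n]-\{i\}} (1 - A_{kl} p_l \sigma) \\
&\le (1-\sigma)^{-a_k} (1-p_i) A_{ki}\sigma\, e^{\sigma A_{ki} p_i} e^{-\sigma\alpha (A_k \cdot \hat{x})} \\
&= s_k (1-p_i) A_{ki}\sigma\, e^{\sigma A_{ki} p_i}
\end{aligned}$$

Now note that $A_{ki} \le 1$, $\sigma \le 1$ and hence $(1-p_i) e^{\sigma A_{ki} p_i} \le 1$. The claimed bound then holds. □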

To gain some intuition about this expression $s_k$, note that if we set $\sigma = 1 - 1/\alpha$ (which is not
necessarily the optimal choice for the overall algorithm), then we have

$$s_k = \alpha^{a_k} e^{-A_k \cdot \hat{x}(\alpha - 1)}$$

and this can be recognized as the Chernoff lower-tail bound. Namely, this is an upper bound on
the probability that a sum of independent $[0,1]$-random variables, with mean $\alpha A_k \cdot \hat{x}$, will become
as small as $a_k$. This makes sense: for example at the very first step of the algorithm (before any
resamplings are performed), $A_k \cdot x$ is precisely a sum of independent Bernoulli variables with
mean $\alpha A_k \cdot \hat{x}$. The event we are measuring (the event that a constraint $k$ is resampled) is
precisely the event that this sum is smaller than $a_k$.

We next bound the running time of the algorithm. 

Proposition 2.4. Suppose $\alpha > \frac{-\ln(1-\sigma)}{\sigma}$. The expected number of resampling steps made by the
algorithm RELAXATION is at most $\sum_k \frac{1}{(1-\sigma)^{a_k} e^{\sigma\alpha A_k \cdot \hat{x}} - 1}$.

Proof. Consider the probability that there are $\ge l$ resamplings of constraint $k$. A necessary condition
for this to occur is that there are sets $Z_1, \dots, Z_l$ such that $Z_1, \dots, Z_l$ are respectively the first
$l$ resampled sets for constraint $k$. Taking a union-bound over $Z_1, \dots, Z_l$, we have:

$$\begin{aligned}
P(\ge l \text{ resamplings of constraint } k)
&\le \sum_{Z_1, \dots, Z_l \subseteq [n]} P(Z_1, \dots, Z_l \text{ are first resampled sets for constraint } k) \\
&\le \sum_{Z_1, \dots, Z_l \subseteq [n]} f_k(Z_1) \cdots f_k(Z_l) \quad \text{(by Lemma 2.1)} \\
&= \Big(\sum_{Z \subseteq [n]} f_k(Z)\Big)^l \\
&\le s_k^l \quad \text{(by Proposition 2.2)}
\end{aligned}$$

As $s_k < 1$, this implies that the expected number of resamplings of constraint $k$ is at most

$$\sum_{l=1}^{\infty} s_k^l = \frac{1}{1/s_k - 1} = \frac{1}{(1-\sigma)^{a_k} e^{\sigma\alpha A_k \cdot \hat{x}} - 1}$$


□ 
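To make the quantity in Proposition 2.4 concrete, here is a minimal Python sketch of a resampling loop together with the bound $\sum_k 1/((1-\sigma)^{a_k} e^{\sigma\alpha A_k \cdot \hat{x}} - 1)$. It is a reading of the surrounding proofs rather than a transcription of Algorithm 1: it assumes that a resampling step for a violated constraint $k$ selects each currently-zero variable $i$ with probability $\sigma A_{ki}$ and redraws it as Bernoulli-$p_i$. The function names, the `max_steps` guard, and the test instance are hypothetical.

```python
import math
import random


def relaxation(A, a, x_hat, sigma, alpha, rng, max_steps=100000):
    """Sketch of a partial-resampling loop; returns (x, number of resampling steps)."""
    m, n = len(A), len(x_hat)
    p = [alpha * xh for xh in x_hat]  # p_i = alpha * x_hat_i, assumed <= 1
    x = [1 if rng.random() < p[i] else 0 for i in range(n)]
    steps = 0
    for k in range(m):
        # Variables only move from 0 to 1 here, so earlier constraints stay satisfied.
        while sum(A[k][i] * x[i] for i in range(n)) < a[k]:
            steps += 1
            if steps > max_steps:
                raise RuntimeError("too many resampling steps")
            for i in range(n):
                if x[i] == 0 and rng.random() < sigma * A[k][i]:
                    x[i] = 1 if rng.random() < p[i] else 0
    return x, steps


def resampling_bound(A, a, x_hat, sigma, alpha):
    """Upper bound of Proposition 2.4 on the expected number of resampling steps."""
    total = 0.0
    for k in range(len(A)):
        dot = sum(A[k][i] * x_hat[i] for i in range(len(x_hat)))
        total += 1.0 / ((1 - sigma) ** a[k] * math.exp(sigma * alpha * dot) - 1)
    return total


A = [[1.0, 1.0, 0.0], [0.0, 1.0, 1.0]]
a = [1.0, 1.0]
x_hat = [0.5, 0.5, 0.5]
x, steps = relaxation(A, a, x_hat, sigma=0.3, alpha=1.6, rng=random.Random(0))
print(x, steps, resampling_bound(A, a, x_hat, 0.3, 1.6))
```

The bound is an expectation over the algorithm's randomness, so a single run may use more or fewer steps than the printed value.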

We also give a crucial bound on the distribution of the variables $x_i$ at the end of the resampling
process.

Theorem 2.5. Suppose $A_k \cdot \hat{x} \ge a_k$ for all $k = 1, \dots, m$ and $\alpha > \frac{-\ln(1-\sigma)}{\sigma}$. Then for any $i \in [n]$,
the probability that $x_i = 1$ at the conclusion of the RELAXATION algorithm is at most

$$P(x_i = 1) \le \alpha \hat{x}_i \Big(1 + \sigma \sum_k \frac{A_{ki}}{(1-\sigma)^{a_k} e^{\sigma\alpha A_k \cdot \hat{x}} - 1}\Big)$$

Proof. There are two possible ways to have $x_i = 1$: either $i$ turns at 0 or it turns at $(k, j)$ for some
$k \in [m]$, $j \ge 1$. The former event has probability $p_i$.

Suppose that $i$ turns at $(k, j)$. In this case, there must be sets $Z_1, \dots, Z_j$ such that:

(C1) The first $j$ resampled sets for constraint $k$ are respectively $Z_1, \dots, Z_j$

(C2) $i \in Z_j$

(C3) During the $j$th resampling of constraint $k$, we set $x_i = 1$.

Now, observe that the probability of (C3), conditional on (C1), (C2), is $p_i$. The reason for this is
that event (C3) occurs after (C1), (C2) are already determined. Thus, we can use time-stochasticity
to compute the conditional probability.

For any fixed $k \in [m]$ and any fixed sets $Z_1, \dots, Z_j$, the probability that $Z_1, \dots, Z_j$ are the first $j$
resampled sets is at most $f_k(Z_1) \cdots f_k(Z_j)$ by Lemma 2.1.

Thus, in total, the probability that events (C1)-(C3) hold for a fixed $Z_1, \dots, Z_j$ where $i \in Z_j$, is at
most $p_i f_k(Z_1) \cdots f_k(Z_j)$.
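We now take a union bound over all $k \in [m]$ and all integers $j \ge 1$ and all sets $Z_1, \dots, Z_j \subseteq [n]$
with $i \in Z_j$. This gives:

$$\begin{aligned}
P(x_i = 1)
&\le p_i + \sum_{k=1}^{m} \sum_{j=1}^{\infty} \sum_{\substack{Z_1, \dots, Z_j \subseteq [n] \\ i \in Z_j}} p_i f_k(Z_1) \cdots f_k(Z_j) \\
&= p_i \Big(1 + \sum_{k=1}^{m} \sum_{Z \subseteq [n],\, Z \ni i} f_k(Z) \sum_{j=1}^{\infty} \sum_{\substack{Z_1, \dots, Z_j \subseteq [n] \\ Z_j = Z}} f_k(Z_1) \cdots f_k(Z_{j-1})\Big) \\
&= p_i \Big(1 + \sum_{k=1}^{m} \sum_{Z \subseteq [n],\, Z \ni i} f_k(Z) \sum_{j=1}^{\infty} \Big(\sum_{Z' \subseteq [n]} f_k(Z')\Big)^{j-1}\Big) \\
&\le p_i \Big(1 + \sum_{k=1}^{m} \sum_{Z \subseteq [n],\, Z \ni i} f_k(Z) \sum_{j=1}^{\infty} s_k^{j-1}\Big) \\
&= p_i \Big(1 + \sum_{k=1}^{m} \frac{\sum_{Z \subseteq [n],\, Z \ni i} f_k(Z)}{1 - s_k}\Big) \quad \text{as } s_k < 1 \\
&\le p_i \Big(1 + \sum_{k=1}^{m} \frac{s_k A_{ki} \sigma}{1 - s_k}\Big) \\
&= \alpha \hat{x}_i \Big(1 + \sigma \sum_{k=1}^{m} \frac{A_{ki}}{(1-\sigma)^{a_k} e^{\sigma\alpha A_k \cdot \hat{x}} - 1}\Big)
\end{aligned}$$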


□ 


3 Extension to the Case Where $\hat{x}_i$ is Large


In the previous section, we described the RELAXATION algorithm under the assumption that
$\hat{x}_i \le 1/\alpha$ for all $i$. This assumption was necessary because each variable $i$ is chosen to be drawn as
a Bernoulli random variable with probability $p_i = \alpha \hat{x}_i$. In this section, we give a rounding scheme
to cover fractional solutions $\hat{x}$ of unbounded size. We first give an overview of this process.

Our goal is to extend the approximation ratio $\beta_i = \alpha\big(1 + \sigma \sum_k \frac{A_{ki}}{e^{\sigma\alpha A_k \cdot \hat{x}} (1-\sigma)^{a_k} - 1}\big)$ of Section 2.
First, note that if we have a variable $i$, and a solution to the LP with fractional value $\hat{x}_i$, we can
sub-divide it into two new variables $y_1, y_2$ with fractional values $\hat{y}_1, \hat{y}_2$ such that $\hat{y}_1 + \hat{y}_2 = \hat{x}_i$. Now,
whenever the variable $x_i$ appears in the covering system, we replace it by $y_1 + y_2$. This process of
sub-dividing variables can force all the entries in the fractional solution to be arbitrarily small. We
can run the RELAXATION algorithm on this subdivided fractional solution, obtaining an integral
solution $y_1, y_2$ and hence $x_i = y_1 + y_2$. Observe that the approximation ratios for the two new
variables are both equal to $\beta_i$ itself. Thus $E[x_i] = E[y_1 + y_2] \le \beta_i \hat{y}_1 + \beta_i \hat{y}_2 = \beta_i \hat{x}_i$.

By subdividing the fractional solution, we can always ensure that we obtain the same approximation
for the general case (in which $\hat{x}$ is unbounded) as in the case in which $\hat{x}$ is restricted to entries of
bounded size. However, this may violate the multiplicity constraints: in general, if we subdivide a
fractional solution $\hat{x}_i$ into $\hat{y}_1, \dots, \hat{y}_l$, and then set $x_i = y_1 + \dots + y_l$, then $x_i$ could become as large
as $l$.

There is another, simpler way to deal with large values $\hat{x}_i$: for any variable with $\hat{x}_i \ge 1/\alpha$, simply
set $x_i = 1$. Then, we certainly are guaranteed that $E[x_i] \le \alpha \hat{x}_i \le \beta_i \hat{x}_i$. Let us see what problems
this procedure might cause. Consider some variable $i$ with $\hat{x}_i = r \ge 1/\alpha$ (see footnote 1). Because we have fixed
$x_i = 1$, we may remove this variable from the covering system. When we do so, we obtain a residual
problem $A', a'$, in which the $i$th column of $A$ is replaced by zero and all the RHS values are
replaced by $a'_k = a_k - A_{ki}$.


Suppose that variable $i$ appears in constraint $k$ with another variable $i'$ with $A_{ki} = 1$. We want to
bound $E[x_{i'}]$ in terms of $\beta_{i'}$; to do so, we want to show that constraint $k$ contributes $\frac{A_{ki'}}{e^{\sigma\alpha A_k \cdot \hat{x}} (1-\sigma)^{a_k} - 1}$
to $E[x_{i'}]$. Now, in the residual problem, we replace $a_k$ with $a_k - 1$ and we replace $A_k \cdot \hat{x}$ with $A_k \cdot \hat{x} - r$.
Thus, constraint $k$ contributes the following to $\beta'_{i'}$:

$$\text{Contribution} = \frac{A_{ki'}}{e^{\sigma\alpha (A_k \cdot \hat{x} - r)} (1-\sigma)^{a_k - 1} - 1} = \frac{A_{ki'}}{e^{\sigma\alpha (A_k \cdot \hat{x})} (1-\sigma)^{a_k} \big(e^{-\sigma\alpha r} (1-\sigma)^{-1}\big) - 1}$$

Observe that if $r > \frac{-\ln(1-\sigma)}{\sigma\alpha}$, then this is larger than the original contribution term we wanted
to show, namely $\frac{A_{ki'}}{e^{\sigma\alpha A_k \cdot \hat{x}} (1-\sigma)^{a_k} - 1}$. Thus, there is a critical cut-off value $\theta = \frac{-\ln(1-\sigma)}{\sigma\alpha}$; when
$\hat{x}_i > \theta$, then forcing $x_i = 1$ gives a good approximation ratio for variable $i$ but may have a worse
approximation ratio for other variables which interact with it.


We can now combine these two methods for handling large entries of $\hat{x}_i$. For any variable $i$, we first
subdivide variable $i$ into multiple variables $y_1, \dots, y_l$ with fractional value $\theta$, along with one further
entry $\hat{y}_{l+1} \in [0, \theta]$. We immediately set $y_1, \dots, y_l = 1$. If $\hat{y}_{l+1} \ge 1/\alpha$, we set $y_{l+1} = 1$ as well,
otherwise we will apply the RELAXATION algorithm for it. At the end of this procedure, we know
that $x_i = y_1 + \dots + y_{l+1} \le l + 1$. We also know that $E[x_i] \le \alpha(\hat{y}_1 + \dots + \hat{y}_l) + \beta_i \hat{y}_{l+1} \le
\beta_i(\hat{y}_1 + \dots + \hat{y}_{l+1}) = \beta_i \hat{x}_i$. Thus, we get a good approximation ratio and a good bound on the
multiplicity of $x_i$.


3.1 The ROUNDING algorithm 


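For each variable $i$, let $v_i = \lfloor \hat{x}_i / \theta \rfloor$, where we define

$$\theta = \frac{-\ln(1-\sigma)}{\alpha\sigma}$$

We define $F_i = \hat{x}_i - v_i \theta$, which we can write as $F_i = \hat{x}_i \bmod \theta$. We also define:

$$G_i = \begin{cases} 0 & \text{if } F_i < 1/\alpha \\ 1 & \text{if } F_i \ge 1/\alpha \end{cases}, \qquad \hat{x}'_i = \begin{cases} F_i & \text{if } F_i < 1/\alpha \\ 0 & \text{if } F_i \ge 1/\alpha \end{cases}$$

We form the residual problem $a'_k = a_k - \sum_i A_{ki}(G_i + v_i)$. We then run the RELAXATION algorithm
on the residual problem, which satisfies the condition that $\hat{x}' \in [0, 1/\alpha]^n$. This is summarized in
Algorithm 2.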
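The quantization step just described can be sketched in a few lines of Python. This is an illustration of the definitions of $\theta$, $v_i$, $F_i$, $G_i$, $\hat{x}'_i$ and $a'_k$ only, and does not include the subsequent call to RELAXATION; the function name, the clamping of $F_i$ against floating-point error, and the sample instance are assumptions made for this sketch.

```python
import math


def quantize(A, a, x_hat, sigma, alpha):
    """Quantization step of ROUNDING: returns (theta, v, G, x_res, a_res)."""
    theta = -math.log(1 - sigma) / (alpha * sigma)
    n = len(x_hat)
    v = [math.floor(xh / theta) for xh in x_hat]
    # F_i = x_hat_i mod theta; clamp at 0 to absorb rounding error.
    F = [max(0.0, xh - vi * theta) for xh, vi in zip(x_hat, v)]
    G = [1 if Fi >= 1 / alpha else 0 for Fi in F]
    x_res = [0.0 if Gi else Fi for Fi, Gi in zip(F, G)]
    # Residual right-hand sides a'_k = a_k - sum_i A_ki (G_i + v_i).
    a_res = [
        a[k] - sum(A[k][i] * (G[i] + v[i]) for i in range(n))
        for k in range(len(A))
    ]
    return theta, v, G, x_res, a_res


theta, v, G, x_res, a_res = quantize(
    A=[[1.0, 0.5, 0.2]], a=[2.0], x_hat=[2.0, 0.3, 0.6], sigma=0.5, alpha=2.0
)
print(theta, v, G, x_res, a_res)
```

A non-positive residual value $a'_k$ means that constraint $k$ is already covered by the fixed part $G + v$ of the solution.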

1 To gain intuition, the reader may consider the case in which r > 1. In this case, it is obvious that this is a bad 
rounding procedure. It is instructive to trace through exactly why it fails badly. 
We begin by showing a variety of simple bounds on the variables before and after the quantization
steps.

Proposition 3.1. Suppose that $x' \in \{0,1\}^n$ satisfies the residual covering constraints, that is,
$A_k \cdot x' \ge a'_k$ for all $k = 1, \dots, m$.

Then the solution vector returned by the ROUNDING algorithm, defined by $x = G + v + x'$, satisfies
the original covering constraints. Namely, $A_k \cdot x \ge a_k$ for all $k = 1, \dots, m$.

Proof. For each $k$ we have:

$$A_k \cdot x = A_k \cdot (x' + G + v) = A_k \cdot x' + \sum_i A_{ki}(G_i + v_i) \ge a'_k + \sum_i A_{ki}(G_i + v_i) = a_k$$

□ 

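Proposition 3.2. For any $i \in [n]$ we have

$$\hat{x}_i - v_i\theta - G_i\theta \le \hat{x}'_i \le \hat{x}_i - v_i\theta - G_i/\alpha$$

Proof. If $G_i = 0$, then both of the bounds hold with equality. So suppose $G_i = 1$.

In this case, we have $\hat{x}'_i = 0$ and $1/\alpha \le \hat{x}_i - v_i\theta \le \theta$. So $\hat{x}_i - v_i\theta - G_i/\alpha \ge 1/\alpha - 1/\alpha = 0$ and $\hat{x}_i - v_i\theta - G_i\theta \le
\theta - \theta = 0$ as required.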

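Proposition 3.3. For any $i$, at the end of the procedure ROUNDING, we have

$$x_i \le \Big\lceil \hat{x}_i \frac{\alpha\sigma}{-\ln(1-\sigma)} \Big\rceil$$

Proof. Note that $1/\theta = \frac{\alpha\sigma}{-\ln(1-\sigma)}$. So we must show that $x_i \le \lceil \hat{x}_i/\theta \rceil$.

First, suppose $\hat{x}_i$ is not a multiple of $\theta$. Then $x_i = x'_i + G_i + \lfloor \hat{x}_i/\theta \rfloor$. Note that if $G_i = 1$, then
$\hat{x}'_i = 0$ which implies that $x'_i = 0$. So $G_i + x'_i \le 1$ and hence $x_i \le 1 + \lfloor \hat{x}_i/\theta \rfloor = \lceil \hat{x}_i/\theta \rceil$.

Next, suppose $\hat{x}_i$ is a multiple of $\theta$. Then $G_i = \hat{x}'_i = 0$ and so $x'_i = 0$ and we have $x_i = \lfloor \hat{x}_i/\theta \rfloor = \lceil \hat{x}_i/\theta \rceil$.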

□ 

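The next result shows that the quantization steps can only decrease the inflation factor for the
RELAXATION algorithm. Proposition 3.4 is the reason for our choice of $\theta$.

Proposition 3.4. For any constraint $k$, we have

$$(1-\sigma)^{a'_k} e^{\sigma\alpha A_k \cdot \hat{x}'} \ge (1-\sigma)^{a_k} e^{\sigma\alpha a_k}$$

Proof. Let $r = \sum_i A_{ki}(G_i + v_i)$. By definition, we have $a'_k = a_k - r$. We also have:

$$\begin{aligned}
A_k \cdot \hat{x}' &= \sum_i A_{ki} \hat{x}'_i \\
&\ge \sum_i A_{ki}(\hat{x}_i - v_i\theta - G_i\theta) \quad \text{by Proposition 3.2} \\
&= A_k \cdot \hat{x} - r\theta \ge a_k - r\theta
\end{aligned}$$

Then

$$\begin{aligned}
(1-\sigma)^{a'_k} e^{\sigma\alpha A_k \cdot \hat{x}'} &= (1-\sigma)^{a_k - r} e^{\sigma\alpha A_k \cdot \hat{x}'} \\
&\ge (1-\sigma)^{a_k - r} e^{\sigma\alpha (a_k - r\theta)} \\
&= (1-\sigma)^{a_k} e^{\sigma\alpha a_k} \times \big((1-\sigma) e^{\theta\alpha\sigma}\big)^{-r} \\
&= (1-\sigma)^{a_k} e^{\sigma\alpha a_k}
\end{aligned}$$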


□ 


We can now show an overall bound on the behavior of the ROUNDING algorithm.

Theorem 3.5. Suppose that $\alpha > \frac{-\ln(1-\sigma)}{\sigma}$. Then at the end of the ROUNDING algorithm, we
have for each variable $i$

$$E[x_i] \le \alpha \hat{x}_i \Big(1 + \sigma \sum_k \frac{A_{ki}}{e^{\sigma\alpha a_k} (1-\sigma)^{a_k} - 1}\Big)$$

The expected number of resamplings for the RELAXATION algorithm is at most $\sum_k \frac{1}{e^{\sigma\alpha a_k} (1-\sigma)^{a_k} - 1}$.


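Proof. Define

$$T = 1 + \sigma \sum_k \frac{A_{ki}}{e^{\sigma\alpha a_k} (1-\sigma)^{a_k} - 1}$$

By Theorem 2.5 the probability that $x'_i = 1$ is at most

$$\begin{aligned}
P(x'_i = 1) &\le \alpha \hat{x}'_i \Big(1 + \sigma \sum_k \frac{A_{ki}}{(1-\sigma)^{a'_k} e^{\sigma\alpha A_k \cdot \hat{x}'} - 1}\Big) \\
&\le \alpha \hat{x}'_i T \quad \text{(by Proposition 3.4)}
\end{aligned}$$

So we estimate $E[x_i]$ by:

$$\begin{aligned}
E[x_i] &= v_i + G_i + E[x'_i] \\
&\le v_i + G_i + \alpha \hat{x}'_i T \\
&\le v_i + G_i + \alpha(\hat{x}_i - \theta v_i - G_i/\alpha) T \quad \text{(by Proposition 3.2)} \\
&\le v_i(1 - \alpha\theta) + \alpha \hat{x}_i T \\
&\le \alpha \hat{x}_i T \quad \text{as } \alpha\theta = \frac{-\ln(1-\sigma)}{\sigma} \ge 1
\end{aligned}$$

This shows the bound on $E[x_i]$. The bound on the expected number of resamplings is similar. □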


4 Bounds in terms of $a_{\min}$, $\Delta_1$


So far, we have given bounds on the behavior of the ROUNDING algorithm which are as general
as possible. Theorem 3.5 can be applied to systems which have multiple types of variables and
constraints. However, we can obtain a simpler bound by reducing these to two simple parameters,
namely $\Delta_1$, the maximum $\ell_1$-norm of any column of $A$, and $a_{\min} = \min_k a_k$. We will first assume
that $a_{\min} \ge 1$, $\Delta_1 \ge 1$. Later, Theorem 4.2 will show that we can always ensure that this holds
with a simple pre-processing step.

Theorem 4.1. Suppose we are given a covering system with $\Delta_1 \ge 1$, $a_{\min} \ge 1$ and with a fractional
solution $\hat{x}$. Let $\gamma = \frac{\ln(\Delta_1 + 1)}{a_{\min}}$.

Then with appropriate choices of $\sigma, \alpha$ we may run the ROUNDING algorithm on this system to
obtain a solution $x \in \mathbb{Z}_+^n$ satisfying

$$E[x_i] \le \hat{x}_i (1 + \gamma + 4\sqrt{\gamma})$$

$$x_i \le \Big\lceil \hat{x}_i \frac{\gamma/2 + \sqrt{\gamma}}{\ln(1 + \sqrt{\gamma})} \Big\rceil \quad \text{with probability one}$$

The expected running time of this algorithm is $O(mn)$.


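Proof. We set $\sigma = 1 - 1/\alpha$, where $\alpha > 1$ is a parameter to be determined. Now note that we have

$$\frac{-\ln(1-\sigma)}{\sigma} = \frac{\alpha \ln \alpha}{\alpha - 1} < \alpha.$$

So we may apply Theorem 3.5; for each $i \in [n]$ we have:

$$\begin{aligned}
E[x_i] &\le \hat{x}_i \alpha \Big(1 + \sigma \sum_k \frac{A_{ki}}{(1-\sigma)^{a_k} e^{\sigma\alpha a_k} - 1}\Big) \\
&= \hat{x}_i \alpha \Big(1 + (1 - 1/\alpha) \sum_k \frac{A_{ki}}{\alpha^{-a_k} e^{(\alpha-1) a_k} - 1}\Big) \\
&\le \hat{x}_i \alpha \Big(1 + (1 - 1/\alpha) \sum_k \frac{A_{ki}}{\alpha^{-a_{\min}} e^{(\alpha-1) a_{\min}} - 1}\Big) \\
&\le \hat{x}_i \Big(\alpha + (\alpha - 1) \frac{\Delta_1}{e^{a_{\min}(\alpha-1)} \alpha^{-a_{\min}} - 1}\Big)
\end{aligned}$$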
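Now substituting $\alpha = 1 + \gamma + 2\sqrt{\gamma} > 1$ and $a_{\min} = \ln(\Delta_1 + 1)/\gamma$ gives

$$E[x_i] \le \hat{x}_i \Big(1 + \gamma + 2\sqrt{\gamma} + \frac{(2\sqrt{\gamma} + \gamma)\Delta_1}{(\Delta_1 + 1)^{\frac{2\sqrt{\gamma} + \gamma - 2\ln(1 + \sqrt{\gamma})}{\gamma}} - 1}\Big)$$

Proposition A.2 shows that this is a decreasing function of $\Delta_1$. We are assuming $a_{\min} \ge 1$, which
implies that $\Delta_1 \ge e^{\gamma} - 1$. We can thus obtain an upper bound by substituting $\Delta_1 = e^{\gamma} - 1$, yielding

$$E[x_i] \le \hat{x}_i \Big(1 + \gamma + 2\sqrt{\gamma} + \frac{(e^{\gamma} - 1)(\gamma + 2\sqrt{\gamma})}{e^{\gamma + 2\sqrt{\gamma}} (1 + \sqrt{\gamma})^{-2} - 1}\Big) \tag{3}$$

Some simple analysis of the RHS of (3) shows that

$$E[x_i] \le \hat{x}_i (1 + \gamma + 4\sqrt{\gamma})$$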


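To show the bound on the size of $x_i$, we apply Proposition 3.3, giving

$$x_i \le \Big\lceil \hat{x}_i \frac{\alpha\sigma}{-\ln(1-\sigma)} \Big\rceil = \Big\lceil \hat{x}_i \frac{\gamma/2 + \sqrt{\gamma}}{\ln(1 + \sqrt{\gamma})} \Big\rceil$$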


Next, let us analyze the runtime of this procedure. The initial steps of rounding and forming the
residual can be done in time $O(mn)$. By Theorem 3.5 the expected number of resampling steps
made by the RELAXATION algorithm is at most

$$\begin{aligned}
E[\text{Resampling Steps}] &\le \sum_k \frac{1}{(1-\sigma)^{a_k} e^{\sigma\alpha a_k} - 1} \\
&\le \frac{m}{e^{a_{\min}(\alpha-1)} \alpha^{-a_{\min}} - 1} \\
&= \frac{m}{(\Delta_1 + 1)^{\frac{\gamma + 2\sqrt{\gamma} - 2\ln(1 + \sqrt{\gamma})}{\gamma}} - 1} \\
&\le m
\end{aligned}$$

In each resampling step, we must draw a new random value for all the variables; this can be easily
done in time $O(n)$. □
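The "simple analysis" of bound (3) can be spot-checked numerically. The following Python sketch evaluates the right-hand side of (3), divided by $\hat{x}_i$, with $\alpha = 1 + \gamma + 2\sqrt{\gamma} = (1+\sqrt{\gamma})^2$, and compares it against $1 + \gamma + 4\sqrt{\gamma}$ on a grid of values of $\gamma$. It is only a numerical illustration on sample points, not a proof, and the function name and the grid are arbitrary choices.

```python
import math


def rhs_of_bound_3(gamma):
    """RHS of (3) divided by x_hat_i, using alpha = (1 + sqrt(gamma))^2."""
    root = math.sqrt(gamma)
    alpha = 1 + gamma + 2 * root
    numerator = math.expm1(gamma) * (gamma + 2 * root)        # (e^gamma - 1)(gamma + 2 sqrt(gamma))
    denominator = math.exp(gamma + 2 * root) / alpha - 1      # e^{gamma + 2 sqrt(gamma)} (1 + sqrt(gamma))^{-2} - 1
    return alpha + numerator / denominator


for gamma in [1e-4, 1e-3, 0.01, 0.1, 0.25, 0.5, 1.0, 2.0, 4.0, 10.0, 50.0]:
    target = 1 + gamma + 4 * math.sqrt(gamma)
    print(gamma, rhs_of_bound_3(gamma), target)
```

On each of these sample points the computed value stays below $1 + \gamma + 4\sqrt{\gamma}$, consistent with the statement of Theorem 4.1.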


We now show how to ensure that $a_{\min} \ge 1$, $\Delta_1 \ge 1$:

Theorem 4.2. Suppose we are given a covering system $A, a$ with $\gamma = \ln(\Delta_1 + 1)/a_{\min}$. Then, in
time $O(mn)$, one can produce a modified system $A', a'$ which satisfies the following properties:

1. The integral solutions of $A, a$ are precisely the same as the integral solutions of $A', a'$;

2. $a'_{\min} \ge 1$ and $\Delta'_1 \ge 1$;

3. We have $\gamma' \le \gamma$, where $\gamma' = \ln(\Delta'_1 + 1)/a'_{\min}$.
Proof. First, suppose that there is some entry $A_{ki}$ with $A_{ki} > a_k$. In this case, set $A'_{ki} = a_k$.
Observe that any integral solution to the constraint $A_k \cdot x \ge a_k$ also satisfies $A'_k \cdot x \ge a_k$, and
vice-versa. This step can only decrease $\Delta_1$ and hence $\gamma' \le \gamma$.

After this step, one can assume that $A_{ki} \le a_k$ for all $k, i$. Now suppose there are some constraints
with $a_k \le 1$. In this case, replace row $A_k$ with $A'_k = A_k/a_k$ and replace $a_k$ with $a'_k = 1$. Because
of our assumption that $A_{ki} \le a_k$ for all $k, i$, the new row of the matrix still satisfies $A'_k \in [0,1]^n$.
This step ensures that $a'_k \ge 1$ for all $k$. Also, every column in the matrix is scaled up by at most
$1/a_k \le 1/a_{\min}$, so we have $\Delta'_1 \le \Delta_1/a_{\min}$ and $a'_{\min} = 1$. We then have

$$\gamma' = \ln(\Delta'_1 + 1)/a'_{\min} \le \ln(\Delta_1/a_{\min} + 1) \le \frac{\ln(\Delta_1 + 1)}{a_{\min}} = \gamma.$$

Finally, suppose that $\Delta_1 \le 1$. In this case, observe that we must have $A_{ki} \le \Delta_1$ for all $k, i$. Thus, we
can scale up both $A, a$ by $1/\Delta_1$ to obtain $A' = A/\Delta_1$, $a' = a/\Delta_1$. This gives $\Delta'_1 = 1$, $a'_{\min} = a_{\min}/\Delta_1$ and

$$\gamma' = \frac{\ln(1 + 1)}{a_{\min}/\Delta_1} \le \frac{\ln(\Delta_1 + 1)}{a_{\min}} = \gamma$$


□ 
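The three steps of this proof translate directly into a small routine. The following Python sketch is one possible rendering on dense list-of-lists matrices (clip large entries, rescale rows with $a_k < 1$, then rescale the whole system if $\Delta_1 < 1$); the function names and the sample instance are hypothetical, and no attempt is made at numerical robustness.

```python
import math


def delta_1(A):
    """Maximum l1-norm of any column of A."""
    return max(sum(row[i] for row in A) for i in range(len(A[0])))


def gamma_param(A, a):
    return math.log(delta_1(A) + 1) / min(a)


def preprocess(A, a):
    """Return (A', a') with a'_min >= 1 and Delta'_1 >= 1, following Theorem 4.2."""
    A = [row[:] for row in A]
    a = a[:]
    for k in range(len(A)):
        # Step 1: entries larger than a_k can be clipped to a_k.
        A[k] = [min(v, a[k]) for v in A[k]]
        # Step 2: rows with a_k < 1 are rescaled so that a'_k = 1.
        if a[k] < 1:
            A[k] = [v / a[k] for v in A[k]]
            a[k] = 1.0
    # Step 3: if Delta_1 < 1, scale the whole system up by 1 / Delta_1.
    d1 = delta_1(A)
    if 0 < d1 < 1:
        A = [[v / d1 for v in row] for row in A]
        a = [ak / d1 for ak in a]
    return A, a


A, a = [[0.5, 0.2], [0.1, 0.3]], [0.4, 2.0]
A2, a2 = preprocess(A, a)
print(A2, a2, gamma_param(A, a), gamma_param(A2, a2))
```

On this instance the first row is clipped and rescaled to $(1, 0.5)$ with right-hand side 1, and the parameter drops from roughly 1.18 to roughly 0.74.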

Corollary 4.3. Let $\gamma = \frac{\ln(\Delta_1 + 1)}{a_{\min}}$. There is an algorithm running in expected polynomial time to
obtain a solution $x \in \mathbb{Z}_+^n$ which satisfies the covering constraints and which satisfies

$$C \cdot x \le (1 + \gamma + O(\sqrt{\gamma}))\, \text{OPT}$$

where OPT is the optimal integral solution to the original CIP.


Proof. First, apply Theorem 4.2 to ensure that $\Delta_1 \ge 1$, $a_{\min} \ge 1$; the resulting CIP has a parameter
$\gamma' = \frac{\ln(\Delta'_1 + 1)}{a'_{\min}} \le \gamma$. Next, consider the corresponding basic LP, in which all multiplicity constraints
are ignored. Let us denote this LP by $\mathcal{Z}$ and let $\hat{z} \in [0, \infty)^n$ be an optimal solution to it, of value
$Z = C \cdot \hat{z}$. Clearly $Z \le \text{OPT}$ since $\mathcal{Z}$ is a relaxation (in two separate ways: $\mathcal{Z}$ ignores the
integrality constraints as well as the multiplicity constraints).

Now suppose we apply Theorem 4.1 and let us denote the solution we obtain (which is a random
variable) by $x \in \mathbb{Z}_+^n$. This solution $x$ satisfies $E[C \cdot x] \le (1 + \gamma' + 4\sqrt{\gamma'})Z \le (1 + \gamma + 4\sqrt{\gamma})Z$. Also,
since $x$ satisfies all the covering constraints, then $x$ is also a solution to the linear program $\mathcal{Z}$; this
implies that $C \cdot x \ge Z$ with probability one.

By applying Markov's inequality to the non-negative random variable $C \cdot x - Z$, we see that

$$P(C \cdot x > (1 + \gamma + 5\sqrt{\gamma})Z) \le \frac{\gamma + 4\sqrt{\gamma}}{\gamma + 5\sqrt{\gamma}}$$

This is an increasing function of $\gamma$, and $\gamma \le \ln(m + 1)$. Simple calculus shows that

$$P(C \cdot x > (1 + \gamma + 5\sqrt{\gamma})Z) \le 1 - \Omega\Big(\frac{1}{\sqrt{\log m}}\Big)$$

Thus, after repeating this process for $O(\sqrt{\log m})$ iterations (in expectation), we achieve an integral
solution which satisfies the covering constraints and which satisfies $C \cdot x \le (1 + \gamma + 5\sqrt{\gamma})Z \le
(1 + \gamma + O(\sqrt{\gamma}))\,\text{OPT}$. □
5 Respecting multiplicity constraints


Theorem 4.1 may considerably violate the multiplicity constraints. We will describe two algorithms
to better satisfy these constraints: the first ensures the multiplicity constraints are approximately
preserved, and gives an approximation ratio in terms of the basic LP. The second preserves the
multiplicity constraints exactly, but gives an approximation factor only in terms of the $\ell_0$ norm of
the constraint matrix and the optimal integral solution.

Theorem 5.1. Suppose we have a CIP with $\Delta_1 \ge 1$, $a_{\min} \ge 1$, and a solution $\hat{x}$ to its basic LP. Let
$\gamma = \frac{\ln(\Delta_1 + 1)}{a_{\min}}$.

Let $\epsilon \in [0,1]$ be given. Then, with an appropriate choice of $\sigma, \alpha$ we may run the ROUNDING
algorithm to obtain a solution $x \in \mathbb{Z}_+^n$ satisfying

$$x_i \le \lceil \hat{x}_i(1 + \epsilon) \rceil \quad \text{with probability one}$$

$$E[x_i] \le \hat{x}_i (1 + 4\sqrt{\gamma} + 4\gamma/\epsilon)$$

The expected run-time is $O(mn)$.


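Proof. First, suppose $\gamma \le \epsilon^2/2$. In this case, apply Theorem 4.1; this ensures that $x_i \le \Big\lceil \hat{x}_i \frac{\gamma/2 + \sqrt{\gamma}}{\ln(1 + \sqrt{\gamma})} \Big\rceil$
and some simple analysis shows that this is at most $\lceil \hat{x}_i(1 + \epsilon) \rceil$. We then have $E[x_i] \le \hat{x}_i(1 + 4\sqrt{\gamma} + \gamma) \le
\hat{x}_i(1 + 4\sqrt{\gamma} + 4\gamma/\epsilon)$ as desired.

Next, suppose $\gamma > \epsilon^2/2$. Set $\alpha = \frac{-(1+\epsilon)\ln(1-\sigma)}{\sigma}$, where $\sigma \in (0,1)$ is a parameter to be determined.
Then by Proposition 3.3 we have $x_i \le \lceil \hat{x}_i(1 + \epsilon) \rceil$ at the end of the ROUNDING algorithm.

We clearly have $\alpha > \frac{-\ln(1-\sigma)}{\sigma}$ and so by Theorem 3.5

$$\begin{aligned}
E[x_i] &\le \alpha \hat{x}_i \Big(1 + \sigma \sum_k \frac{A_{ki}}{(1-\sigma)^{a_k} e^{\sigma\alpha a_k} - 1}\Big) \\
&\le \alpha \hat{x}_i \Big(1 + \sigma \frac{\Delta_1}{(1-\sigma)^{-a_{\min}\epsilon} - 1}\Big)
\end{aligned}$$

Now set $\sigma = 1 - e^{-\gamma/\epsilon}$; observe that this is indeed in the range $(0,1)$. This ensures that
$(1-\sigma)^{-a_{\min}\epsilon} = \Delta_1 + 1$ and hence

$$E[x_i] \le \hat{x}_i \alpha (1 + \sigma) = \hat{x}_i \Big[\epsilon^{-1}\Big(2 + \frac{1}{e^{\gamma/\epsilon} - 1}\Big)(1 + \epsilon)\gamma\Big]$$

Simple calculus shows that $\epsilon^{-1}\big(2 + \frac{1}{e^{\gamma/\epsilon} - 1}\big)(1 + \epsilon)\gamma \le 1 + \epsilon + (2 + 2/\epsilon)\gamma$. By our assumption that
$\epsilon \in [0,1]$ and our assumption that $\epsilon^2/2 < \gamma$, this is at most $1 + \sqrt{2\gamma} + 4\gamma/\epsilon$ as desired.

The bound on the running time follows the same lines as Theorem 4.1. □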

Next, we show how to exactly preserve multiplicity constraints. We follow here the approach of
[13], which in turn builds on an approach of [2]: they construct a stronger linear program via
the knapsack-cover (KC) inequalities. This LP has exponential size, but can be approximately
optimized in polynomial time. We then round the resulting solution using Theorem 4.1. Although
this algorithm is discussed in great detail in [13] and [2], we give a self-contained presentation here
to fill in a few technical details.

The key to the KC inequalities is to form a residual problem, given that a set of variables $X$ is
"pinned" to their maximal values.

Definition 5.2 (The pinned-residual problem). Suppose we have a CIP problem, with constraint
matrix $A$, RHS vector $a$, and multiplicity constraints $d$. Given any $X \subseteq [n]$, we define the pinned-residual,
denoted $PR(X)$, to be a new CIP problem $A', a', d$ which we obtain as follows.

1. For each $k = 1, \dots, m$, let $v_k = a_k - \sum_{i \in X} A_{ki} d_i$, and set

$$a'_k = \begin{cases} v_k & \text{if } v_k \ge 1 \\ 1 & \text{if } v_k \in (0, 1] \\ 0 & \text{if } v_k \le 0 \end{cases}$$

2. For each $k, i$ set:

$$A'_{ki} = \begin{cases} 0 & \text{if } i \in X \\ 0 & \text{if } i \notin X,\ v_k \le 0 \\ \min(1, A_{ki}/v_k) & \text{if } i \notin X,\ v_k \in (0, 1] \\ A_{ki} & \text{if } i \notin X,\ v_k \ge 1 \end{cases}$$

Observe that if any constraint has $a'_k = 0$, then it has effectively disappeared. Also, observe that
for $i \in X$, the constraint matrix $A'$ does not involve variable $x_i$ (the column corresponding to $i$ is
zero). Hence, we may assume that any solution $x$ to $PR(X)$ has $x_i = 0$ for $i \in X$.
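Definition 5.2 is mechanical enough to write down directly. The following Python sketch builds $A'$ and $a'$ for a given pinned set $X$ on dense list-of-lists data; the function name and the small instance are hypothetical, and the boundary case $v_k = 1$ is handled by the first matching branch (both candidate branches give the same values there).

```python
def pinned_residual(A, a, d, X):
    """Return (A', a') of the pinned-residual problem PR(X) from Definition 5.2."""
    n = len(d)
    A_res, a_res = [], []
    for k in range(len(A)):
        # Residual demand after pinning every i in X to its maximum value d_i.
        v_k = a[k] - sum(A[k][i] * d[i] for i in X)
        if v_k >= 1:
            a_res.append(v_k)
        elif v_k > 0:
            a_res.append(1.0)
        else:
            a_res.append(0.0)
        row = []
        for i in range(n):
            if i in X or v_k <= 0:
                row.append(0.0)
            elif v_k >= 1:
                row.append(A[k][i])
            else:
                row.append(min(1.0, A[k][i] / v_k))
        A_res.append(row)
    return A_res, a_res


A = [[1.0, 0.5, 0.5], [0.5, 0.0, 1.0], [0.5, 1.0, 0.25]]
a = [2.5, 1.0, 4.0]
d = [2, 2, 2]
print(pinned_residual(A, a, d, X={0}))
```

In this example the three rows exercise the three cases of the definition: $v_1 = 0.5$ (row rescaled, $a'_1 = 1$), $v_2 = 0$ (constraint disappears), and $v_3 = 3$ (row kept, $a'_3 = 3$).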

Proposition 5.3 ([13], [2]). For any $X \subseteq [n]$, the following hold:

1. Any integral solution to the original CIP $A, a, d$ also satisfies $PR(X)$.

2. $PR(X)$ has $a'_{\min} \ge 1$, $\Delta'_1 \le \Delta_0$, where $\Delta_0$ is the maximum $\ell_0$-column norm of $A$.

Theorem 5.4. Given any CIP $A, a, d, C$, there is an algorithm which runs in expected polynomial
time and returns a solution $x \in \mathbb{Z}_+^n$ satisfying the covering and multiplicity constraints, and
satisfying

$$C \cdot x \le (1 + \ln \Delta_0 + O(\sqrt{\log \Delta_0}))\, \text{OPT},$$

where OPT is the optimal integral solution.

Proof. Let y 0 = ln(A 0 + 1) and let 5 = ■ 

We begin by finding a fractional solution $\hat{x}$ which minimizes $C \cdot \hat{x}$, subject to the conditions that
$\hat{x}_i \in [0, d_i]$ and such that $\hat{x}$ satisfies $PR(\{i \mid \hat{x}_i \ge d_i/\delta\})$. This can be done using the ellipsoid
method: given some putative $\hat{x}$, one can form $PR(\{i \mid \hat{x}_i \ge d_i/\delta\})$ and determine which constraint
in it, if any, is violated. (See [13] for more details.)








Suppose we are given some optimal LP solution $\hat{x}$ satisfying this condition. By Proposition 5.3
any optimal integral solution satisfies $PR(Y)$ for all $Y \subseteq [n]$, and in particular is a solution to the
given LP. Thus, $C \cdot \hat{x} \le \text{OPT}$.

Let $X = \{i \mid \hat{x}_i \ge d_i/\delta\}$. Set $x_i = d_i$ for $i \in X$. For $i \notin X$, we run the algorithm of Theorem 4.1
on $PR(X)$ to obtain a random solution $x_i$.

For $i \in X$, we clearly have $x_i \le d_i$. Observe that by Proposition 5.3 $PR(X)$ has $\gamma' \le \gamma_0$. So
for $i \notin X$, we have $x_i \le \lceil \delta \hat{x}_i \rceil$; this is at most $\lceil d_i \rceil = d_i$ by definition of $X$. So $x$ satisfies the
multiplicity constraints.

Next, for $i \in X$ we clearly have $E[x_i] \le d_i \le \hat{x}_i \delta \le \hat{x}_i(\gamma_0 + O(1))$. Also, for $i \notin X$, we have

$$E[x_i] \le \hat{x}_i(1 + \gamma' + 4\sqrt{\gamma'});$$

by Proposition 5.3 this is $\le \hat{x}_i(1 + \gamma_0 + 4\sqrt{\gamma_0})$.

Thus, for some constant $c$,

$$E[C \cdot x] \le (1 + \gamma_0 + c\sqrt{\gamma_0})\, \text{OPT}$$


On the other hand, since $x$ satisfies the covering constraints and multiplicity constraints, we have
$C \cdot x \ge \text{OPT}$ with probability one. By Markov's inequality applied to the non-negative random
variable $C \cdot x - \text{OPT}$,

$$P(C \cdot x > (1 + \gamma_0 + 2c\sqrt{\gamma_0})\,\text{OPT}) \le \frac{\gamma_0 + c\sqrt{\gamma_0}}{\gamma_0 + 2c\sqrt{\gamma_0}} \le 1 - \Omega\Big(\frac{1}{\sqrt{\log m}}\Big)$$

Thus, after repeating this process for $O(\sqrt{\log m})$ iterations (in expectation), we achieve a solution
satisfying

$$C \cdot x \le (1 + \gamma_0 + 2c\sqrt{\gamma_0})\,\text{OPT} \le (1 + \ln \Delta_0 + O(\sqrt{\log \Delta_0}))\,\text{OPT}$$


□ 


6 Lower bounds on approximation ratios 


In this section, we provide lower bounds on the approximation ratios of CIP algorithms. These
bounds fall into two categories, namely, inapproximability of CIP (which follows from inapproximability
of set cover), and integrality gaps for the basic LP. The formal statements of these results
contain numerous qualifiers and technical conditions. So, we will summarize our results informally
here:

1. Under the hypothesis P $\ne$ NP, any polynomial-time algorithm to solve the CIP while
ignoring multiplicity constraints, which gives an approximation ratio parametrized as a function
of $\gamma$, cannot have an approximation ratio of the form $(1 - a)\ln\gamma$, where $a > 0$ is any
constant.

2. Under the Exponential Time Hypothesis (ETH), any polynomial-time algorithm to
solve the CIP while ignoring multiplicity constraints, which gives an approximation ratio
parametrized as a function of $\gamma$, cannot have an approximation ratio of the form $\ln\gamma - C\ln\ln\gamma$,
where $C$ is some specific universal constant.

3. When $\gamma$ is large, then the gap between solutions to the basic LP, and integral solutions which
$\epsilon$-respect the multiplicity constraints, can be as large as $\Omega(\gamma/\epsilon)$. By contrast, the algorithm
of Theorem 5.1 achieves approximation ratio $O(\gamma/\epsilon)$.

4. When $\gamma$ is small, the integrality gap of the basic LP is $1 + \Omega(\gamma)$; by contrast, the algorithm
of Theorem 4.1 achieves approximation ratio $1 + O(\sqrt{\gamma})$.


We note that one might wish to formulate an approximation ratio in terms of many possible parameters;
two natural ones are $\Delta_1, a_{\min}$ but there are numerous others including $\Delta_0, n, m$, etc. Lower
bounds for the approximation ratio are very difficult to state in the context of these multiparametric
approximations. Thus, it is quite possible that one is able to show approximation ratios which are
incomparable to ours, perhaps even ones which are stronger for most natural problem instances.
However, approximation ratios which are functions of $\gamma = \frac{\ln(\Delta_1 + 1)}{a_{\min}}$ cannot be significantly improved.
(It is possible that there is an alternate and stronger functional form, which depends on $\Delta_1, a_{\min}$
in a more complicated way than their ratio.)


6.1 Hardness results 

Set Cover is a well-studied special case of CIP; many of the hardness results for Set Cover thus
automatically imply corresponding hardness results for CIP. These hardness results are all based on
a construction of Feige [7], which was later strengthened by Moshkovitz. These results showed
that assuming P $\ne$ NP, then for any constant $a > 0$, set cover on a domain of size $n$ cannot be
approximated within a factor of $(1 - a)\ln n$ in polynomial time. Dinur & Steurer [4] showed that
under the Exponential Time Hypothesis, set cover cannot be approximated to within a factor
of $\ln n - C \ln\ln n$, where $C$ is a universal constant.

Proposition 6.1. Suppose that there is a function $f : [0, \infty) \to [0, \infty)$ and a polynomial-time algorithm
$\mathcal{A}$ to approximate CIP, such that $\mathcal{A}$ is guaranteed to achieve approximation ratio $f\big(\frac{\ln(\Delta_1 + 1)}{a_{\min}}\big)$.
Then:

1. Assuming P $\ne$ NP, there cannot be any $a > 0$ such that $f(x) \le (1 - a)x$ for all sufficiently
large $x$.

2. Assuming ETH, one cannot have $f(x) \le x - C \ln x$ for all sufficiently large $x$, where $C$ is
some universal constant.

Similarly, suppose that there is a function $f : [0, \infty) \to [0, \infty)$ and a polynomial-time algorithm $\mathcal{A}$
to approximate CIP, such that $\mathcal{A}$ is guaranteed to achieve approximation ratio $f(\ln(\Delta_0 + 1))$. Then:

1. Assuming P $\ne$ NP, there cannot be any $a > 0$ such that $f(x) \le (1 - a)x$ for all sufficiently
large $x$.

2. Assuming ETH, one cannot have $f(x) \le x - C \ln x$ for all sufficiently large $x$, where $C$ is
some universal constant.


Proof. Set Cover instances on a domain of size $n$ can be encoded as CIP instances with $\Delta_0 = \Delta_1 \le
n - 1$ and $a_{\min} = 1$. Namely, for each item $i \in [n]$, we construct a constraint $\sum_{j : i \in S_j} x_j \ge 1$, where
$x_j$ is an indicator variable for the set $S_j$ appearing in the cover. The $\ell_0$ and $\ell_1$-column norms
corresponding to a variable $x_j$ are both $|S_j|$. We may assume that none of the sets $S_j$ is equal to
$[n]$, and so $\Delta_1, \Delta_0$ are both at most $n - 1$. □
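The encoding used in this proof can be written out explicitly. The following Python sketch builds the 0/1 constraint matrix with one row per item and one column per set, and reports the two column norms; the function name and the toy instance are hypothetical, and items are numbered from 0 for convenience.

```python
def set_cover_to_cip(n, sets):
    """Encode Set Cover on items 0..n-1 as a CIP: row per item, column per set, RHS 1."""
    A = [[1 if item in S else 0 for S in sets] for item in range(n)]
    a = [1] * n
    # For a 0/1 matrix the l0 and l1 column norms coincide with |S_j|.
    delta_0 = max(sum(1 for k in range(n) if A[k][j] != 0) for j in range(len(sets)))
    delta_1 = max(sum(A[k][j] for k in range(n)) for j in range(len(sets)))
    return A, a, delta_0, delta_1


sets = [{0, 1}, {1, 2, 3}, {0, 3}]
A, a, delta_0, delta_1 = set_cover_to_cip(4, sets)
print(A, a, delta_0, delta_1)

# The cover {S_0, S_1} corresponds to x = (1, 1, 0), which satisfies every row.
x = [1, 1, 0]
print(all(sum(A[k][j] * x[j] for j in range(3)) >= a[k] for k in range(4)))
```

Here no set equals the whole ground set, so both column norms are at most $n - 1 = 3$, matching the assumption made in the proof.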

In particular, for $\gamma$ (respectively $\Delta_0$) large, the approximation ratio guarantees of Theorem 4.1
(respectively Theorem 5.4) are optimal up to first-order.


6.2 Integrality gaps for the regime in which $\gamma \to \infty$

We next show a variety of integrality gaps for the basic LP. These constructions work as follows:
we give a CIP instance, as well as an upper bound on the weight $\hat{T}$ of the fractional solution for the
basic LP and a lower bound on the weight $T$ of any integral solution. This implies that algorithms
which start from the basic LP solution must cause the weight to increase by at least $T/\hat{T}$.

In this section, we show integrality gaps matching Theorems 4.1 and 5.1 when $\gamma$ is large.

Proposition 6.2. Let $a \ge 1$, $m \ge 1$ be given. There is a CIP program with $m$ covering constraints
and no multiplicity constraints which satisfies the following properties:

1. All the RHS values are equal to a common value $a$.

2. The entries of the constraint matrix $A$ are in $\{0, 1\}$.

3. Let $\hat{T}$ be the optimal value of this basic LP, and let $T$ be the optimal value of the covering
program itself.


~ In m log log m 
1 /1 > -c- > 7 


clog 7 


for some universal constant c > 0 . 


Proof. First, we claim that we can assume that $m$ is larger than any desired constant. For, suppose
$m \le m_0$. Then, for some constant $c > 0$, we have $\ln m - c\log\log m \le 1$ for all $m \le m_0$. We
certainly have $T/\hat{T} \ge 1$, so we have $T/\hat{T} \ge \ln m - c\log\log m \ge \frac{\ln m - c\log\log m}{a}$. Likewise, we can
assume that $\ln m > a$. We will make both of these simplifications for the remainder of the proof.

We will form the $m$ constraints randomly as follows: we select exactly $s$ positions $i_1, \dots, i_s$ uniformly
at random in $[n]$ without replacement, where $s = \lceil pn \rceil$; here $n \to \infty$ and $p \to 0$ as functions of $m$.
We then set $A_{ki_1} = \dots = A_{ki_s} = 1$; all other entries of $A_k$ are set to zero. The RHS vector is always
equal to $a$. The objective function $C$ is defined by $C \cdot x = \sum_i x_i$, that is, each variable is assigned
weight one.
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We can form a fractional solution x by setting Xi = a/s. As each constraint contains exactly s 
entries with coefficient one, this satisfies all the covering constraints. Thus, the optimal fractional 
solution has value T < na/s = a/p. 


Now suppose we fix some integral solution $x$ of weight $\sum_i x_i = t$. Let $I \subseteq [n]$ denote the support of
$x$, that is, the values $i \in [n]$ such that $x_i > 0$; we have $|I| = r \leq t$. In each constraint $k$, there
is a probability of $\binom{n-r}{s} / \binom{n}{s}$ that $A_{ki} = 0$ for all $i \in I$. If this occurs, then certainly $A_k \cdot x = 0$
and the covering constraint is violated. Thus, the probability that $x$ satisfies constraint $k$ is at
most $1 - \binom{n-r}{s} / \binom{n}{s}$. As all the constraints are independent, the total probability that $x$ satisfies all $m$
constraints is at most:

$$
\begin{aligned}
P(x \text{ satisfies all constraints and has weight } t) &\leq \Bigl(1 - \frac{\binom{n-r}{s}}{\binom{n}{s}}\Bigr)^m \\
&\leq \exp\Bigl(-m \frac{\binom{n-t}{s}}{\binom{n}{s}}\Bigr) \\
&\leq \exp\Bigl(-m \Bigl(\frac{n-s-(t-1)}{n}\Bigr)^t\Bigr) \\
&\leq \exp\bigl(-m(1 - p - t/n)^t\bigr) \qquad \text{as } s \leq pn + 1
\end{aligned}
$$
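The per-constraint estimate above can be checked on small numbers. The following is a minimal sketch, not part of the original argument: it compares the exact probability that a random row misses a fixed support of size $t$ with the lower bound $(1 - p - t/n)^t$. The function names and the sample values of $n$, $p$, $t$ are illustrative assumptions.

```python
from fractions import Fraction
from math import ceil, comb

def miss_probability(n, s, r):
    """Exact probability that s positions drawn without replacement
    from [n] avoid a fixed set of r positions."""
    return Fraction(comb(n - r, s), comb(n, s))

def miss_lower_bound(n, p, t):
    """The quantity (1 - p - t/n)^t used in the proof (with s = ceil(p*n))."""
    return (1 - p - Fraction(t, n)) ** t

# Illustrative parameters (assumed, not from the paper).
n, p, t = 200, Fraction(1, 10), 5
s = ceil(p * n)
exact = miss_probability(n, s, t)
bound = miss_lower_bound(n, p, t)
print(float(exact), float(bound))
```

Exact rational arithmetic is used so that the comparison is not affected by rounding.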


We want to ensure that there are no good integral solutions. To upper-bound the probability that
there exists such a good $x$, we take a union bound over all integral $x$. In fact, our estimate only
depended on specifying the support of $x$, not the values it takes on there, so we only need to take a
union bound over all subsets of $[n]$ of cardinality $\leq t$. There are at most $\binom{n}{t} \leq n^t$ such sets,
and thus we have

$$
\begin{aligned}
P(\text{some } x \text{ satisfies all constraints}) &\leq n^t \exp\bigl(-m(1 - p - t/n)^t\bigr) \\
&\leq \exp\bigl(t \ln n - m(1-p)^t + m t^2/n\bigr)
\end{aligned}
$$

We now set $n = mt$, and obtain

$$
\begin{aligned}
P(\text{some } x \text{ satisfies all constraints}) &\leq \exp\bigl(t(1 + \ln(mt)) - m(1-p)^t\bigr) \\
&\leq \exp\bigl(t^2 \ln m - m \exp(-pt - p^2 t)\bigr) \qquad \text{for } m, t \text{ sufficiently large and } p \text{ sufficiently small}
\end{aligned}
$$


If this expression is smaller than one, then there is a positive probability that no such
integral solution exists. Hence, we can ensure that all integral solutions satisfy $T > t$. Now, some
simple analysis shows that this expression is $< 1$ when $p = 1/\ln m$ and $t = p^{-1}(\ln m - 10 \ln\ln m)$
and $m$ is sufficiently large. Thus we have

$$
T/\hat{T} \geq \frac{p^{-1}(\ln m - 10 \ln\ln m)}{a/p} \geq \frac{\ln m - O(\log\log m)}{a}
$$


> 111(0 + 11 - O(log( log(0+1) )) 


as we have claimed. 


□ 
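As a numerical sanity check of the parameter choice $p = 1/\ln m$, $t = p^{-1}(\ln m - 10\ln\ln m)$ used in the proof above, the sketch below evaluates the exponent $t^2 \ln m - m\exp(-pt - p^2 t)$ of the final union bound. It is parameterised by $\ln m$ rather than $m$ to avoid overflow; the function name is an assumption made for illustration. Note that $t > 0$ requires $\ln m > 10 \ln\ln m$, that is, $\ln m$ larger than roughly 36.

```python
import math

def union_bound_exponent(log_m):
    """Exponent t^2 ln m - m exp(-p t - p^2 t) for p = 1/ln m and
    t = (ln m - 10 ln ln m) / p, with log_m standing for ln m."""
    p = 1.0 / log_m
    t = (log_m - 10.0 * math.log(log_m)) / p
    # m * exp(-p t - p^2 t) is computed as exp(ln m - p t - p^2 t)
    # so that m itself never has to be represented.
    return t * t * log_m - math.exp(log_m - p * t - p * p * t)

for log_m in (40.0, 60.0, 100.0):
    print(log_m, union_bound_exponent(log_m))
```

A negative exponent means the union bound is below one, so an instance with no integral solution of weight at most $t$ exists for that $m$.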
This argument can be adjusted to take into account a $(1+\epsilon)$ violation of the multiplicity constraints.

Proposition 6.3. Let $a, m$ be given integer parameters and let $\epsilon \in (0,1)$. Then there is a CIP
instance on $m$ constraints which share a common RHS value $a$ and a parameter $d > 0$ such
that the fractional solution $\hat{x} \in [0, d]^n$ has objective value $\hat{T}$, the optimal integral solution in
$x \in \{0, 1, \dots, \lceil (1+\epsilon) d \rceil\}^n$ has objective value $T$, and

$$
T/\hat{T} \geq \frac{\ln m - c \ln\ln m}{a \epsilon}
$$

for some universal constant $c > 0$.

Hence, the basic LP solution cannot be rounded to within $o(\frac{\ln m}{a\epsilon})$ while $\epsilon$-respecting multiplicity.


Proof. Let $A$ be the CIP instance constructed in Proposition 6.2, in $n$ variables and $m$ constraints
and with RHS value equal to one. By construction, it satisfies $T/\hat{T} \geq \ln m - c \ln\ln m$ for some
constant $c > 0$.


We form a new CIP instance $A'$ on $n + m$ variables and $m$ constraints; for each constraint $k =
1, \dots, m$ we set

$$
\frac{a}{K(1+\epsilon)+1}\, x_{n+k} + \sum_{i=1}^n A_{ki} x_i \geq a
$$

and we have an objective function $C \cdot x = \sum_{i=1}^n x_i$, that is, each variable $x_1, \dots, x_n$ has weight
one, and each variable $x_{n+1}, \dots, x_{n+m}$ has weight zero. We set $d_i = \infty$ for $i = 1, \dots, n$ and we set
$d_i = K$ for $i = n+1, \dots, n+m$; here $K$ is a large integer parameter, which we will specify shortly.
(In particular, for $K$ sufficiently large, all the coefficients in this constraint are in the range $[0,1]$.)


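The resulting CIP instance contains $m$ constraints and $a_{\min} = a$. Now suppose that $\hat{x}$ is a fractional
solution to the original CIP instance. Then let $v = \frac{a(K\epsilon+1)}{K(1+\epsilon)+1}$ and consider the fractional solution
$\hat{x}'$ defined by

$$
\hat{x}'_i = \begin{cases} v \hat{x}_i & \text{if } i \leq n \\ K & \text{if } n+1 \leq i \leq n+m \end{cases}
$$

Observe that for any constraint $k$ we have

$$
\begin{aligned}
\frac{a}{K(1+\epsilon)+1}\, \hat{x}'_{n+k} + \sum_{i=1}^n A_{ki} \hat{x}'_i &= \frac{aK}{K(1+\epsilon)+1} + v \sum_{i=1}^n A_{ki} \hat{x}_i \\
&\geq \frac{aK}{K(1+\epsilon)+1} + v \qquad \text{as } A_k \cdot \hat{x} \geq a_k = 1 \\
&= a
\end{aligned}
$$

and so this is a valid LP solution. Thus the fractional objective value is at most $\hat{T}' \leq \sum_{i=1}^n v \hat{x}_i = v \hat{T}$.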

On the other hand, consider an integral solution $x'$. As $x'_{n+k} \leq \lceil (1+\epsilon) K \rceil$, we have that for all
$k \in [m]$:

$$
\frac{a}{K(1+\epsilon)+1} \lceil (1+\epsilon) K \rceil + \sum_{i=1}^n A_{ki} x'_i \geq a
$$

which implies that $\sum_{i=1}^n A_{ki} x'_i > 0$.

As all the entries of $A_k$ are in $\{0,1\}$, this implies that $\sum_{i=1}^n A_{ki} x'_i \geq 1$, and so $x'$ is an integral
solution to the original CIP instance. Thus, its objective value is at least $T' \geq T$, where $T$ is the
optimal integral solution to the original $A$.


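Thus we have that

$$
\frac{T'}{\hat{T}'} \geq \frac{T}{v \hat{T}} \geq \frac{\ln m - c \ln\ln m}{v}
$$

Taking the limit as $K \to \infty$, we see that for any $\delta > 0$ there exists a CIP with integrality gap

$$
\frac{T'}{\hat{T}'} \geq \frac{(1+\epsilon)(\ln m - c \ln\ln m)}{a \epsilon} - \delta
$$

In particular, as $\epsilon > 0$, we can select $\delta$ sufficiently small so that

$$
\frac{T'}{\hat{T}'} \geq \frac{\ln m - c \ln\ln m}{a \epsilon}
$$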

□ 
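The arithmetic behind this padding construction can be checked with exact rationals. The sketch below is an illustration under stated assumptions: it takes the dummy-variable coefficient to be $a/(K(1+\epsilon)+1)$ and the scaling factor to be $v = a(K\epsilon+1)/(K(1+\epsilon)+1)$, which is the value forced by requiring the fractional solution to meet each constraint with equality. The function name and sample values are hypothetical.

```python
from fractions import Fraction
from math import ceil

def scaling_factor(a, eps, K):
    """v = a (K eps + 1) / (K (1 + eps) + 1); assumed form, chosen so that
    a K / (K (1 + eps) + 1) + v = a."""
    return a * (K * eps + 1) / (K * (1 + eps) + 1)

a, eps = 3, Fraction(1, 2)
for K in (10, 100, 10_000):
    coeff = Fraction(a) / (K * (1 + eps) + 1)   # coefficient of the dummy variable
    v = scaling_factor(a, eps, K)
    # Fractional side: dummy variable at K plus v units of original coverage gives exactly a.
    assert coeff * K + v == a
    # Integral side: even at ceil((1 + eps) K) the dummy variable alone falls short of a.
    assert coeff * ceil((1 + eps) * K) < a
    print(K, float(v))

limit = a * eps / (1 + eps)   # value approached by v as K grows
```

As $K$ grows, $v$ approaches $a\epsilon/(1+\epsilon)$, which is where the factor $(1+\epsilon)/(a\epsilon)$ in the integrality gap comes from.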


In light of this result, we note that Theorem 5.1 has an optimal approximation ratio in terms
of $\gamma, \epsilon$ for $\gamma \to \infty$, up to a constant factor. However, this integrality gap construction does not
apply to Theorem 5.4, which uses a stronger LP formulation (the KC constraints). For this reason,
Theorem 5.4 is able to achieve an approximation ratio which remains bounded as $\epsilon \to 0$.


6.3 Integrality gaps for the regime $\gamma \to 0$

We next show an integrality gap for the case of small $\gamma$. To our knowledge, this is the first non-trivial
hardness result in this regime; previous works show, for instance, integrality gaps of the form
$\Omega(\gamma)$, which is of course vacuous when $\gamma \to 0$.

This integrality gap does not match Theorem 4.1 precisely; here, we obtain an integrality gap of
order $1 + \Omega(\gamma)$ while Theorem 4.1 gives the weaker approximation ratio $1 + O(\sqrt{\gamma})$. Our construction
here is based on an integrality gap of order $\log m$ for set cover, which we extend to CIP by allowing large
RHS values.

Proposition 6.4. For any $g \in (0,1)$ and $m \geq 2^{1+14/g}$, there is a CIP with $m$ covering constraints
and no multiplicity constraints, all the RHS values equal to a common value $a$ where $\frac{\ln m}{a} \leq g$, and
which satisfies also the following integrality gap property: Let $\hat{T}$ be the optimal value of the basic
LP and let $T$ be the optimal integral value. Then

$$
T/\hat{T} \geq 1 + g/8 \geq 1 + \Omega(\gamma)
$$

In particular, it is impossible to guarantee an approximation ratio of the form $1 + o(\frac{\log m}{a_{\min}})$ as a
function of the basic LP solution.

Proof. We set $n = 2^q - 1$ where $q = \lfloor \log_2 m \rfloor$. We will view the integers from $1, \dots, n$ as
corresponding to the non-zero binary strings of length $q$. Thus, if $i, i' \in \{1, \dots, n\}$, then we write $i \cdot i'$ to
denote the binary dot-product. Namely if we have $i = i_0 + 2i_1 + 4i_2 + \dots$ and $i' = i'_0 + 2i'_1 + 4i'_2 + \dots$
where $i_j, i'_j \in \{0, 1\}$, then we define $i \cdot i' = \bigoplus_{j=0}^{q-1} i_j i'_j$.

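The covering system is defined as follows: For each $k \in \{1, \dots, n\}$ we have a constraint

$$
\sum_{i : k \cdot i = 1} x_i \geq a \qquad \text{where } a = \frac{q-1}{g}
$$

The objective function is $C \cdot x = \sum_i x_i$. This has $n \leq m$ constraints. Observe that we have
$m \geq 2^{1+14/g} > 91.7$, and so $\log_2 m - 2 \geq \ln m$ and hence we have $\frac{\ln m}{a} \leq g$.

We form the fractional solution $\hat{x}$ by setting $\hat{x}_i = a/2^{q-1}$ for $i = 1, \dots, n$; each constraint sums exactly $2^{q-1}$ of the variables, so $\hat{x}$ is feasible. This shows that the optimal
fractional solution has value $\hat{T} \leq na/2^{q-1} < 2a$.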
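The construction can be made concrete for a small $q$. The sketch below is illustrative only: it assumes that constraint $k$ sums the variables $x_i$ with binary dot product $k \cdot i = 1$ (the reading under which the later count of at most $T - a$ orthogonal choices makes sense), and that the fractional solution is uniform with value $a/2^{q-1}$. The helper names and the small values of $q$ and $g$ are hypothetical and do not satisfy $q > 14/g$; they only illustrate the LP side.

```python
from fractions import Fraction

def gf2_dot(i, k):
    """Binary dot product of the bit strings of i and k
    (parity of the number of common 1-bits)."""
    return bin(i & k).count("1") % 2

def build_rows(q):
    """For each non-zero k, the list of non-zero i with k . i = 1 over GF(2)."""
    n = 2 ** q - 1
    return {k: [i for i in range(1, n + 1) if gf2_dot(i, k) == 1]
            for k in range(1, n + 1)}

q, g = 4, Fraction(1, 2)          # small illustrative values
a = Fraction(q - 1) / g           # common RHS value a = (q - 1) / g
rows = build_rows(q)
x_hat = a / 2 ** (q - 1)          # uniform fractional value per variable
lp_value = (2 ** q - 1) * x_hat   # objective value of the uniform solution
print(a, x_hat, lp_value)
```

Every row contains exactly $2^{q-1}$ variables, so the uniform solution meets each constraint with equality and its value stays below $2a$.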

Now consider some integral solution $x \in \mathbb{Z}_+^n$ with $\sum_i x_i = T$. We can write $x$ as a sum of basis
vectors, $x = e_{y_1} + \dots + e_{y_T}$, where $y_1, \dots, y_T$ are not necessarily distinct. Consider the quantity

$$
V = \sum_{k} \sum_{1 \leq l_1 < \dots < l_{q-1} \leq T} [k \cdot y_{l_1} = \dots = k \cdot y_{l_{q-1}} = 0]
$$

where $[\,\cdot\,]$ is the Iverson notation (which is one if $k \cdot y_{l_1} = \dots = k \cdot y_{l_{q-1}} = 0$ and zero otherwise).

We count $V$ in two different ways. First, for any $l_1, \dots, l_{q-1}$, by linear algebra over $GF(2)$ there
must exist at least one $k \neq 0$ which is orthogonal to all $y_{l_1}, \dots, y_{l_{q-1}}$. Hence we have $V \geq \binom{T}{q-1}$.

Second, for any $k$, there are at most $T - a$ choices of $y_l$ which are orthogonal to $k$. Thus we have
$V \leq (2^q - 1)\binom{T-a}{q-1}$. We have shown a lower bound on $V$ and an upper bound on $V$. The lower
bound on $V$ must be at most the upper bound on $V$, or otherwise we would have a contradiction.
Thus, a necessary condition for $x$ to satisfy the covering constraints is that

$$
\frac{\binom{T}{q-1}}{\binom{T-a}{q-1}(2^q - 1)} \leq 1 \qquad (4)
$$


We claim that (4) implies that $T \geq (q-1)(2/g + 1/4)$. As the LHS of (4) is a decreasing function
of $T$, it suffices to show that (4) is violated for $T = (q-1)(2/g + 1/4)$. Rearranging some terms
and recalling that $a = (q-1)/g$, we see that it suffices to show that

$$
\frac{\binom{(q-1)(2/g+1/4)}{q-1}}{\binom{(q-1)(1/g+1/4)}{q-1}(2^q - 1)} > 1 \qquad (5)
$$

We use the bound $2^q - 1 < 2^q$ and the bound on the factorial $\sqrt{2\pi}\, r^{r+1/2} e^{-r} \leq r! \leq e\, r^{r+1/2} e^{-r}$, to
obtain the following condition, which implies (5):

$$
2^{-q} (4-3g)^{\frac{-3gq+5g+4q-4}{4g}} (8-3g)^{\frac{3gq-5g-8q+8}{4g}} (g+4)^{-\frac{(g+4)q+g-4}{4g}} (g+8)^{\frac{(g+8)q+g-8}{4g}} > \frac{e^2}{2\pi} \qquad (6)
$$
We can increase the RHS of (6) slightly to $e$ to simplify the calculations, and take the logarithm
to solve for $q$. This gives us the following condition, which implies (6):

$$
q > 1 + \frac{2g(-2 + \ln(4-3g) - \ln(8-3g) - \ln(g+4) + \ln(g+8) - 2\ln 2)}{4g\ln 2 - (4-3g)\ln(4-3g) + (8-3g)\ln(8-3g) + (4+g)\ln(4+g) - (8+g)\ln(8+g)} \qquad (7)
$$

The RHS of (7) is a function of $g$ alone. Simple but tedious analysis (see Proposition A.5) shows
that it is at most $14/g$.


But note that $q = \lfloor \log_2 m \rfloor > \log_2 m - 1$; thus, our bound on the size of $m$ guarantees that indeed
$q > 14/g$. So (7) $\Rightarrow$ (6) $\Rightarrow$ (5) $\Rightarrow$ $T \geq (q-1)(2/g + 1/4)$. The integrality gap is then given by

$$
T/\hat{T} \geq \frac{(q-1)(2/g + 1/4)}{2a} = \frac{2a + ag/4}{2a} = 1 + g/8
$$


□ 
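Two steps of the proof above lend themselves to a direct check. The sketch below is illustrative and uses hypothetical helper names: it brute-forces the linear-algebra claim (any $q-1$ vectors over $GF(2)^q$ have a common non-zero orthogonal vector) for a small $q$, and it evaluates the necessary condition (4) exactly for the sample parameters $g = 1/2$, $q = 29$ (so that $q > 14/g$), where $a = 56$ and the claimed threshold is $(q-1)(2/g + 1/4) = 119$.

```python
from itertools import product
from math import comb

def gf2_dot(i, k):
    """Binary dot product of the bit strings of i and k."""
    return bin(i & k).count("1") % 2

def has_common_orthogonal(ys, q):
    """True if some non-zero k in GF(2)^q is orthogonal to every y in ys."""
    return any(all(gf2_dot(k, y) == 0 for y in ys) for k in range(1, 2 ** q))

def condition_4_holds(T, a, q):
    """Condition (4): C(T, q-1) <= C(T-a, q-1) * (2^q - 1)."""
    return comb(T, q - 1) <= comb(T - a, q - 1) * (2 ** q - 1)

# Linear-algebra step, checked exhaustively for q = 4 (all triples of non-zero vectors).
q_small = 4
all_have_orthogonal = all(
    has_common_orthogonal(ys, q_small)
    for ys in product(range(1, 2 ** q_small), repeat=q_small - 1)
)

# Condition (4) for g = 1/2, q = 29: a = (q-1)/g = 56, threshold T = 119.
q, a, T_claim = 29, 56, 119
min_feasible_T = next(T for T in range(a + q - 1, 10_000) if condition_4_holds(T, a, q))
print(all_have_orthogonal, condition_4_holds(T_claim, a, q), min_feasible_T)
```

The search starts at $T = a + q - 1$ because for smaller $T$ the binomial $\binom{T-a}{q-1}$ vanishes and (4) fails trivially.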


7 Multi-criteria Programs 


One extension of the covering integer program framework is the presence of multiple linear objectives.
Suppose now that instead of a single linear objective, we have multiple objectives $C_1 \cdot x, \dots, C_r \cdot x$.
We also may have some over-all objective function $D$ defined by the following:

$$
D(x_1, \dots, x_n) = D(C_1 \cdot x, \dots, C_r \cdot x)
$$

For example, we might have $D = \max_\ell C_\ell \cdot x$ or we might have $D = \sum_\ell (C_\ell \cdot x)^2$.

We note that the greedy algorithm, which is powerful for set cover, is not obviously useful in this
case. However, depending on the precise form of the function $D$, it may be possible to solve the
fractional relaxation to optimality. For example, if $D = \max_\ell C_\ell \cdot x$, then this amounts to a linear
program of the form $\min t$ subject to $C_\ell \cdot x \leq t$ for all $\ell$.

For our purposes, the algorithm used to solve the fractional relaxation is not relevant. Suppose
we are given some solution $\hat{x}$. We now want to find a solution $x$ such that we have simultaneously
$C_\ell \cdot x \approx C_\ell \cdot \hat{x}$ for all $\ell$. Showing bounds on the expectations alone is not sufficient: it might be
the case that $E[C_\ell \cdot x] \leq \beta C_\ell \cdot \hat{x}$, but the random variables $C_1 \cdot x, \dots, C_r \cdot x$ are negatively correlated.
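A toy example makes the last point concrete. The distribution below is hypothetical and is not produced by any algorithm in this paper: $x$ is $(1,0)$ or $(0,1)$ with probability $1/2$ each, and the two objectives read off the two coordinates. Each objective has expectation $1/2$, yet the larger of the two is always $1$, so per-objective expectation bounds say nothing about simultaneous guarantees.

```python
from fractions import Fraction

# Hypothetical two-point distribution over integral solutions x.
outcomes = {(1, 0): Fraction(1, 2), (0, 1): Fraction(1, 2)}
# Two objectives: C_1 . x = x_1 and C_2 . x = x_2.
C = [(1, 0), (0, 1)]

def dot(c, x):
    """Inner product of two equal-length tuples."""
    return sum(ci * xi for ci, xi in zip(c, x))

# Expectation of each objective separately.
expectations = [sum(p * dot(c, x) for x, p in outcomes.items()) for c in C]
# Expectation of the max-objective D = max_l C_l . x.
expected_max = sum(p * max(dot(c, x) for c in C) for x, p in outcomes.items())
print(expectations, expected_max)
```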

In [19], Srinivasan gave a construction which provided this type of simultaneous approximation
guarantee. This algorithm was based on randomized rounding, which succeeded only with an
exponentially small probability. Srinivasan also gave a derandomization of this process, leading to
a somewhat efficient algorithm. This derandomization had a somewhat worsened approximation
ratio compared to the single-criterion setting, roughly of the order $O(1 + \frac{\log(\Delta_0+1)}{a_{\min}})$, and running
time of $n^{O(\log r)}$. In particular, this was only polynomial if $r$ was constant.

In this section, we will show that at the end of the ROUNDING algorithm, the values of $C_\ell \cdot x$ are
concentrated around their means. This will establish that there is a good probability that we have
$C_\ell \cdot x \approx E[C_\ell \cdot x]$ for all $\ell = 1, \dots, r$. Thus, our algorithm automatically gives good approximation
ratios for multi-criteria problems; the ratios are essentially the same as for the single-criterion
setting, and there is no extra computational burden.

We begin by showing that the values of $x$ produced by the RELAXATION algorithm obey a type
of negative correlation property. We will show this via a type of "witness" construction, similar to
Lemma 2.1; however, instead of providing a witness for the event that $x_i = 1$, we will provide a
witness for the event that simultaneously $x_{i_1} = \dots = x_{i_s} = 1$.

This proof is based on induction similar to Lemma 2.1. Suppose we are given any set $I \subseteq [n]$, any
integers $J_1, \dots, J_m \geq 0$, and an array of sets $Z = \langle Z_{k,j} \mid k = 1, \dots, m,\ j = 1, \dots, J_k \rangle$. We then
define the event $\mathcal{E}(I, J, Z)$ to be the following:

1. For each $k = 1, \dots, m$, the first $J_k$ resampled sets for constraint $k$ are respectively $Z_{k,1}, \dots, Z_{k,J_k}$.

2. Each $i \in I$ turns at 0 or at some $(k,j)$ where $1 \leq j \leq J_k$.

We similarly define the event $\mathcal{E}(I, J, Z, v)$ for any $v \in \{0,1\}^n$ to be that the event $\mathcal{E}(I, J, Z)$ occurs
if we start the RELAXATION algorithm by setting $x = v$ (instead of drawing $x$ as independent
Bernoulli-$p_i$), and the event $\mathcal{E}(T, I, J, Z, v)$ to be the event that $\mathcal{E}(I, J, Z, v)$ occurs and the
RELAXATION algorithm terminates in less than $T$ resamplings.

Given any integers $J_1, \dots, J_m$, we define prefix$(J)$ to be the set of all pairs $(k,j)$ where $1 \leq j \leq J_k$.

Proposition 7.1. Suppose that $\hat{x} \in [0, 1/\alpha)^n$. Let $v \in \{0,1\}^n$, $I \subseteq [n]$, and $J, Z$ be given.
Then

$$
P(\mathcal{E}(I, J, Z)) \leq \prod_{i \in I} p_i \prod_{(k,j) \in \text{prefix}(J)} f_k(Z_{k,j})
$$


Proof. Define

$$
D = \bigcup_{(k,j) \in \text{prefix}(J)} Z_{k,j}
$$

We prove by induction on $T$ that for any $T \geq 0$ we have

$$
P(\mathcal{E}(T, I, J, Z, v)) \leq \frac{\prod_{i \in I \cap D} p_i}{\prod_{i \in D} (1 - p_i)} \prod_{(k,j) \in \text{prefix}(J)} f_k(Z_{k,j})
$$

A few details of the proof which are identical to Lemma 2.1 are omitted for clarity.

Let $k$ be minimal such that $A_k \cdot x < a_k$. If $J_\ell \geq 1$ for any $\ell < k$ then the event $\mathcal{E}(T, I, J, Z, v)$
is impossible and we are done. If $J_k = 0$, then $\mathcal{E}(T, I, J, Z, v)$ is equivalent to $\mathcal{E}(T-1, I, J, Z, x')$
where $x'$ is the value of the variables after a resampling; for this we use the induction hypothesis
and we are done.

So suppose $J_k \geq 1$. Define $D'$ to be the union of the sets $Z_{\ell,j}$ over all $(\ell,j) \in \text{prefix}(J)$ other than $(k,1)$. Then the following are necessary events to have
$\mathcal{E}(T, I, J, Z, v)$:
(A1) We select $Z_{k,1}$ as the resampled set for constraint $k$

(A2) The event $\mathcal{E}(T-1, I', J', Z', x')$ occurs, where $x'$ is the value of the variables after resampling,
where $I' = I \cap D'$, and $J', Z'$ are derived by setting $J'_k = J_k - 1$ and by $Z'_{k,1}, \dots, Z'_{k,J_k-1} =
Z_{k,2}, \dots, Z_{k,J_k}$ (and all other entries remain the same)

(A3) For any $i \in (Z_{k,1} - D') \cap I$ we resample $x_i = 1$

(A4) For any $i \in Z_{k,1} \cap D'$ we resample $x_i = 0$


The rationale for (A3) is that we require $i \in I$ to turn at some $(\ell, j) \in \text{prefix}(J)$, and in addition
$Z_{\ell,j}$ is the $j$-th resampled set for constraint $\ell$. This would imply that $i \in Z_{\ell,j}$. However, there is only
one such $(\ell,j)$, namely $(\ell,j) = (k,1)$. Thus, we are requiring $i$ to become resampled to $x_i = 1$.

The rationale for (A4) is the same as in Lemma 2.1: if we resample $x_i = 1$, then $x_i$ can never be
resampled again. In particular, we cannot have $i$ in any future resampled set. Thus if $x'_i = 1$ but
$i \in Z_{k,1} \cap D'$, then the event (A2) is impossible.

As in Lemma 12.11 the event (Al) has probability < (1 — a) a,: DTie[n] (~ Wi^z k , a • 

Event (A3), conditional on (A1), has probability $\prod_{i \in (Z_{k,1} - D') \cap I} p_i$.
Event (A4), conditional on (A1), (A3), has probability $\prod_{i \in Z_{k,1} \cap D'} (1 - p_i)$.
By induction hypothesis, event (A2), conditional on (A1), (A3), (A4), has probability

$$
P((A2)) \leq \frac{\prod_{i \in I' \cap D'} p_i}{\prod_{i \in D'} (1 - p_i)} \prod_{(\ell,j) \in \text{prefix}(J')} f_\ell(Z'_{\ell,j})
$$

Multiplying these probabilities, after some rearrangement, gives us the desired bound on $P(\mathcal{E}(T, I, J, Z, v))$,
thus completing the induction.


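Next, as in Lemma 2.1, we immediately obtain also

$$
P(\mathcal{E}(I, J, Z, v)) = \lim_{T \to \infty} P(\mathcal{E}(T, I, J, Z, v)) \leq \frac{\prod_{i \in I \cap D} p_i}{\prod_{i \in D} (1 - p_i)} \prod_{(k,j) \in \text{prefix}(J)} f_k(Z_{k,j})
$$

Finally, to obtain a bound on $P(\mathcal{E}(I, J, Z))$, we observe that if $i \in D$, then $x_i$ must be equal to
zero during the initial sampling. Also, if $i \in I - D$, then $x_i$ must be equal to one during the
initial sampling. This has probability $\prod_{i \in I - D} p_i \prod_{i \in D} (1 - p_i)$. Conditional on this event, we have

$$
P(\mathcal{E}(I, J, Z, x)) \leq \frac{\prod_{i \in I \cap D} p_i}{\prod_{i \in D} (1 - p_i)} \prod_{(k,j) \in \text{prefix}(J)} f_k(Z_{k,j})
$$

Thus, multiplying the probabilities together gives us

$$
P(\mathcal{E}(I, J, Z)) \leq \prod_{i \in I} p_i \prod_{(k,j) \in \text{prefix}(J)} f_k(Z_{k,j})
$$

as desired.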


□ 
Proposition 7.2. Let $R \subseteq [n]$. Suppose that at the end of the RELAXATION algorithm we have
$x_i = 1$ for all $i \in R$.

Then there is a set $R' \subseteq R$ and an injective function $h : R' \to [m]$, as well as non-negative integers
$J_1, \dots, J_m$ and sets $Z_{k,j}$ for $j = 1, \dots, J_k$, which satisfy the following properties:

(D1) For all $i \in R'$ we have $J_{h(i)} \geq 1$ and $i \in Z_{h(i), J_{h(i)}}$

(D2) For all $k \notin h(R')$ we have $J_k = 0$

(D3) Each $i \in R$ turns at either 0 or at some $(k,j)$ for $j \leq J_k$

Proof. Let $S_0 \subseteq R$ denote the set of variables $i \in R$ which turn at 0. For each $k = 1, \dots, m$ let
$S_k \subseteq R$ denote the variables $i \in R$ which turn at constraint $k$, where each $i \in S_k$ turns at $(k, L_i)$.

Observe that $S_0, S_1, \dots, S_m$ form a partition of $R$.

Now for each $k = 1, \dots, m$ we define:

$$
J_k = \max_{i \in S_k} L_i
$$

We form the set $R'$ by selecting, for each $k \in [m]$ with $S_k \neq \emptyset$, exactly one $i \in S_k$ with $L_i = J_k$
(there may be more than one; in which case we select $i$ arbitrarily). We define $h$ by mapping this
$i \in S_k$ to $k$.

Note that we must have $i \in Z_{h(i), J_{h(i)}}$ as we are assuming that $L_i = J_k$ where $k = h(i)$.

Also, each $i \in S_k$ must turn at $(k, L_i)$ and $L_i \leq J_k$, thus (D3) is satisfied.


□ 
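The bookkeeping in this proof can be sketched in a few lines. The code below is illustrative only: it assumes the turning information is handed over as a dictionary mapping each $i \in R$ either to `0` (turned at 0) or to a pair `(k, L_i)`, takes $J_k = 0$ when $S_k$ is empty, and breaks ties by choosing the smallest index. The function name and input format are hypothetical.

```python
def extract_witness(turns, m):
    """Given turns[i] = 0 or (k, L_i) for each i in R, return (J, h) where
    J[k] = max L_i over S_k (0 if S_k is empty) and h maps one chosen
    i in each non-empty S_k with L_i = J[k] to k."""
    S = {k: [] for k in range(1, m + 1)}
    for i, where in turns.items():
        if where != 0:
            k, L = where
            S[k].append((i, L))
    J = {k: max((L for _, L in members), default=0) for k, members in S.items()}
    h = {}
    for k, members in S.items():
        if members:
            # Ties are broken by taking the smallest index (an arbitrary choice).
            chosen = min(i for i, L in members if L == J[k])
            h[chosen] = k
    return J, h

# Hypothetical run: variables 1 and 2 turn at constraint 2, variable 4 at
# constraint 1, and variable 3 turns at 0.
turns = {1: (2, 3), 2: (2, 1), 3: 0, 4: (1, 2)}
J, h = extract_witness(turns, m=3)
print(J, h)
```

The set $R'$ is the set of keys of `h`; since each constraint contributes at most one key, `h` is injective.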

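Theorem 7.3. Suppose $\hat{x} \in [0, 1/\alpha)^n$ and $\alpha > \frac{-\ln(1-\sigma)}{\sigma}$. For any $R \subseteq [n]$, the probability that
$x_i = 1$ for all $i \in R$ is at most

$$
P\Bigl(\bigwedge_{i \in R} x_i = 1\Bigr) \leq \prod_{i \in R} \rho_i
$$

where, for each $i \in [n]$, we define

$$
\rho_i = \alpha \hat{x}_i \Bigl(1 + \sigma \sum_k \frac{A_{ki}}{(1-\sigma)^{a_k} e^{\sigma \alpha A_k \cdot \hat{x}} - 1}\Bigr)
$$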


Proof. By Proposition 7.2, there must exist $R', h, Z_{k,j}, J$ satisfying (D1), (D2), (D3). By Proposition 7.1,
for any $Z, J$ satisfying (D1), (D2) the probability of satisfying (D3) is at most $\prod_{i \in R} p_i \prod_{(k,j) \in \text{prefix}(J)} f_k(Z_{k,j})$.

Taking a union bound over all such $J, Z_{k,j}$ we have:

$$
P\Bigl(\bigwedge_{i \in R} x_i = 1\Bigr) \leq \sum_{\substack{R', h, Z, J \\ \text{satisfying (D1), (D2)}}} \prod_{i \in R} p_i \prod_{(k,j) \in \text{prefix}(J)} f_k(Z_{k,j}) \qquad (8)
$$

We must enumerate over all $R', h, Z, J$ satisfying (D1), (D2). Suppose now that $R'$ and $h$ are fixed.
To simplify the notation, let us suppose w.l.o.g. that $R' = \{1, \dots, r\}$. We now consider the following
process to enumerate over $Z, J$:

1. We select any vector of integers $J' \in \mathbb{Z}_+^r$ and sets $Z'_{i,j}$ where $j \leq J'_i$.

2. For each $i \in R'$, we select a set $W_i \subseteq [n]$ with $i \in W_i$.

3. We define $J$ by $J_{h(i)} = J'_i + 1$ for $i = 1, \dots, r$, and all other values of $J$ are equal to zero. Also
for $j = 1, \dots, J'_i$ we set $Z_{h(i),j} = Z'_{i,j}$, and finally $Z_{h(i),J'_i+1} = W_i$.

Now observe that for a fixed R', h this process enumerates every Z , J satisfying (Dl), (D2) exactly 
once. Furthermore, for any J',Z',W, we have 

J' 

T J k 

n fk( z kj)= E 

(fcj)Sprefix( J) Wi3l,...,W r 3r ieR’ i=lj=l 


Thus, summing over possible values for Z', J 1 , W we have: 


e n A(^)-n( e Aw(Wi)E e fi a( z »«),i 

i=l lVC[n],W3i i , >0Z k(j)il ,..,4 (i)i3 ,i=l 


Z,J (fcj')Sprefix( J) 

satisfying (D1),(D2) 


< n ShtyAbtyjO- x E ( s ft(i)) J by Propositions E21 [O] 

i=l j’> o 

Sh(i)A-h(i),i® 


n 

ie-R' 


1 - 


^h(i) 


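Thus, now we may sum over $R' \subseteq R$ and injective $h : R' \to [m]$ as:

$$
\begin{aligned}
\sum_{\substack{R', h, Z, J \\ \text{satisfying (D1), (D2)}}} \prod_{i \in R} p_i \prod_{(k,j) \in \text{prefix}(J)} f_k(Z_{k,j})
&\leq \prod_{i \in R} p_i \sum_{\substack{R' \subseteq R \\ \text{injective } h : R' \to [m]}} \prod_{i \in R'} \frac{s_{h(i)} A_{h(i),i} \sigma}{1 - s_{h(i)}} \\
&\leq \prod_{i \in R} p_i \sum_{\substack{R' \subseteq R \\ h : R' \to [m]}} \prod_{i \in R'} \frac{s_{h(i)} A_{h(i),i} \sigma}{1 - s_{h(i)}} \\
&= \prod_{i \in R} p_i \sum_{R' \subseteq R} \prod_{i \in R'} \sum_{k=1}^m \frac{s_k A_{ki} \sigma}{1 - s_k} \\
&= \prod_{i \in R} p_i \Bigl(1 + \sum_{k=1}^m \frac{s_k A_{ki} \sigma}{1 - s_k}\Bigr) = \prod_{i \in R} \rho_i
\end{aligned}
$$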


□ 


We can now show a concentration phenomenon for $C \cdot x$. In order to obtain the simplest such
bounds, we can make an assumption that the entries of $C$ are in the range $[0,1]$. In this case, we
can use the Chernoff upper-tail function to give estimates for the concentration of $C \cdot x$.

Definition 7.4 (The Chernoff upper-tail). For $t \geq \mu$ with $\delta = \delta(\mu, t) = t/\mu - 1 \geq 0$, the Chernoff
upper-tail bound is defined as

$$
\text{Chernoff-U}(\mu, t) = \Bigl(\frac{e^\delta}{(1+\delta)^{1+\delta}}\Bigr)^\mu \qquad (9)
$$

That is to say, Chernoff-U$(\mu, t)$ is the Chernoff bound on the probability that a sum of $[0,1]$-bounded and independent
random variables with mean $\mu$ will be above $t$.
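For concreteness, the Chernoff upper-tail function can be evaluated as follows. This is a minimal sketch assuming the standard form $(e^\delta/(1+\delta)^{1+\delta})^\mu$ with $\delta = t/\mu - 1$; the function name `chernoff_u` is chosen here for illustration. The computation is done in log space to avoid overflow for large $\mu$.

```python
import math

def chernoff_u(mu, t):
    """Chernoff upper-tail bound (e^delta / (1 + delta)^(1 + delta))^mu
    with delta = t / mu - 1, for 0 < mu <= t."""
    if mu <= 0 or t < mu:
        raise ValueError("requires 0 < mu <= t")
    delta = t / mu - 1.0
    # log of the bound: mu * (delta - (1 + delta) * ln(1 + delta))
    return math.exp(mu * (delta - (1.0 + delta) * math.log1p(delta)))

print(chernoff_u(1.0, 2.0))    # delta = 1, so the bound equals e / 4
print(chernoff_u(10.0, 20.0))  # same delta, larger mean: (e / 4)^10
```

At $t = \mu$ the bound is trivially $1$, and for fixed $\delta$ it decays exponentially in $\mu$.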

Corollary 7.5. Suppose that all entries of $C_\ell$ are in the interval $[0,1]$ and that $\hat{x} \in [0, 1/\alpha]^n$
and $\alpha > \frac{-\ln(1-\sigma)}{\sigma}$. Then, after running the RELAXATION algorithm, the probability of the event
$C_\ell \cdot x > t$ is at most Chernoff-U$(C_\ell \cdot \rho, t)$.

Proof. The value of $C_\ell \cdot x$ is a sum of random variables $C_{\ell i} x_i$ which are in the range $[0,1]$. These
random variables obey a negative-correlation property as shown in Theorem 7.3. This implies that
they obey the same upper-tail Chernoff bounds as would a sum of random variables $X_i$ which are
independent and satisfy $E[X_i] = \rho_i$. □


We next need to show concentration for the ROUNDING algorithm. 

Theorem 7.6. Suppose that all entries of $C_\ell$ are in $[0,1]$. Then, after the ROUNDING algorithm,
the probability of the event $C_\ell \cdot x > t$ is at most Chernoff-U$(C_\ell \cdot \rho, t)$.


Proof. Let $v_i, G_i, \hat{x}'_i, a'_k, x'$ be the variables which occur during the ROUNDING algorithm. We
have

$$
\begin{aligned}
P(C_\ell \cdot x > t) &= P(C_\ell \cdot (v\theta + G + x') > t) \\
&= P(C_\ell \cdot x' > t - C_\ell \cdot (v\theta + G)) \\
&\leq \text{Chernoff-U}(C_\ell \cdot \rho', t - C_\ell \cdot (v\theta + G))
\end{aligned}
$$

Thus, we have 

P{Ci -x>t)< Chernoff-U (a £ C H xf (l + a £ — _ ) ,t-C r (v6 + G)) (10) 

i k O a ) ke k 1 


By Proposition A.3, Chernoff-U$(\mu, t)$ is always an increasing function of $\mu$. So we can show an
upper bound for this expression by giving an upper bound for the $\mu$ term in (10). We first
apply Propositions 3.2 and 3.4, which give:


Xi 


1 + <T S 


A 


ki 


(1 - a) a ke (yaA k-x l - 1 


< (Xi - Vit 



where we define 


r = i + ^E 


i/cz 


(1 — (j) a ke aaak — 1 


Substituting this upper bound into (10) yields:

P{Ci ■ x > t) < Chernoff-U Cu{xi — v r f) — Gi/a)T) , t — Ci ■ (vd + G)J 

i 
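$$
\begin{aligned}
&\leq \text{Chernoff-U}\Bigl(\sum_i C_{\ell i}(\rho_i - (v_i \alpha\theta + G_i)),\ t - C_\ell \cdot (v\theta + G)\Bigr) \\
&\leq \text{Chernoff-U}\bigl((C_\ell \cdot \rho) - (C_\ell \cdot (v\theta + G)),\ t - (C_\ell \cdot (v\theta + G))\bigr) \\
&\leq \text{Chernoff-U}(C_\ell \cdot \rho, t) \qquad \text{by Proposition A.4}
\end{aligned}
$$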

□ 


In the column-sparsity setting, we obtain the following result which extends Theorem 5.1:

Corollary 7.7. Suppose we are given a covering system as well as a fractional solution $\hat{x}$. Let
$\gamma = \frac{\ln(\Delta_1 + 1)}{a_{\min}}$. Suppose that the entries of $C_\ell$ are in $[0,1]$. Then, with an appropriate choice of $\alpha, \sigma$
we may run the ROUNDING algorithm in expected time $O(mn)$ to obtain a solution $x \in \mathbb{Z}_+^n$ such
that

$$
P(C_\ell \cdot x > t) \leq \text{Chernoff-U}(\beta C_\ell \cdot \hat{x}, t)
$$

for $\beta = 1 + \gamma + 4\sqrt{\gamma}$.

If one wishes to ensure also that $x_i \leq \lceil \hat{x}_i (1+\epsilon) \rceil$ for $\epsilon \in (0,1)$, then one can obtain a similar result
with an approximation factor $\beta = 1 + 4\sqrt{\gamma} + 4\gamma/\epsilon$.
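To get a feel for the size of the factor in the first part of Corollary 7.7, the sketch below evaluates $\gamma = \ln(\Delta_1 + 1)/a_{\min}$ and $\beta = 1 + \gamma + 4\sqrt{\gamma}$. The function names and the sample inputs are illustrative assumptions; only the unconstrained-multiplicity factor is computed.

```python
import math

def gamma_param(delta1, a_min):
    """gamma = ln(Delta_1 + 1) / a_min."""
    return math.log(delta1 + 1.0) / a_min

def beta_factor(gamma):
    """beta = 1 + gamma + 4 sqrt(gamma), the factor in the first part of Corollary 7.7."""
    return 1.0 + gamma + 4.0 * math.sqrt(gamma)

# Hypothetical instances: maximum column l1-norm 50, with increasing a_min.
for a_min in (1, 10, 100, 1000):
    g = gamma_param(50.0, a_min)
    print(a_min, round(g, 4), round(beta_factor(g), 4))
```

As $a_{\min}$ grows, $\gamma$ shrinks and $\beta$ approaches $1$, with the $4\sqrt{\gamma}$ term dominating the excess.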
A Some technical lemmas


Proposition A.1. Given a set $S$, $x_i \in [0,1]$, and $a \in (0,1)$, we have

$$
\prod_{i \in S} (1 - a x_i)^{-1} \leq (1 - a)^{-\sum_{i \in S} x_i}
$$

Proof. Using the convexity of the function $\prod_{i \in S} (1 - a x_i)^{-1}$ we can show that, for a fixed value of
$s = \sum_{i \in S} x_i$, the maximum occurs when at most one $x_i$ is fractional. So suppose $s = z + r$
where $z \in \mathbb{Z}_+$ and $r \in (0,1)$. Then we have

$$
\prod_{i \in S} (1 - a x_i)^{-1} \leq (1 - a)^{-z} (1 - ra)^{-1} \leq (1 - a)^{-z} (1 - a)^{-r} = (1 - a)^{-s}
$$


□ 
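Proposition A.1 is easy to spot-check numerically. The following sketch (with hypothetical helper names and randomly drawn test data) evaluates the ratio of the left-hand side to the right-hand side; the inequality says this ratio never exceeds one, with equality when every $x_i$ is 0 or 1.

```python
import math
import random

def a1_ratio(xs, a):
    """Ratio of prod (1 - a x_i)^(-1) to (1 - a)^(-sum x_i); at most 1 by Proposition A.1."""
    lhs = math.prod(1.0 / (1.0 - a * x) for x in xs)
    rhs = (1.0 - a) ** (-sum(xs))
    return lhs / rhs

def max_ratio(trials=2000, seed=0):
    """Largest ratio observed over random choices of a in (0, 1) and x in [0, 1]^S."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        a = rng.uniform(0.01, 0.99)
        xs = [rng.random() for _ in range(rng.randint(1, 8))]
        worst = max(worst, a1_ratio(xs, a))
    return worst

print(max_ratio())
```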


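Proposition A.2. For any $\gamma > 0$, define

$$
f(x) = \frac{x}{(x+1)^{\frac{2\sqrt{\gamma} + \gamma - 2\ln(1+\sqrt{\gamma})}{\gamma}} - 1}
$$

Then $f(x)$ is a decreasing function of $x$ for $x > 0$.


Proof. At $x = 0$, both numerator and denominator are equal to zero. So it suffices to show that
the denominator grows faster than the numerator, that is, that the derivative of the denominator
is always $\geq 1$. We compute the derivative of the denominator:

$$
\begin{aligned}
R &= \frac{(\gamma + 2\sqrt{\gamma} - 2\ln(\sqrt{\gamma} + 1))\,(x+1)^{\frac{2\sqrt{\gamma} - 2\ln(1+\sqrt{\gamma})}{\gamma}}}{\gamma} \\
&\geq \frac{(\gamma + 0)(x+1)^0}{\gamma} \qquad \text{as } y \geq \ln(1+y) \text{ for } y \geq 0 \\
&= 1 \qquad \text{as desired}
\end{aligned}
$$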


□ 


Proposition A.3. For any $0 < \mu \leq \mu' \leq t$ we have Chernoff-U$(\mu, t) \leq$ Chernoff-U$(\mu', t)$.


Proof. Compute the partial derivative of Chernoff-U$(\mu, t)$ with respect to $\mu$. □

Proposition A.4. For any $0 < \mu \leq t$ and any $0 \leq r < \mu$, we have Chernoff-U$(\mu - r, t - r) \leq$ Chernoff-U$(\mu, t)$.


Proof. Compute the directional derivative of Chernoff-U$(\mu, t)$ along the direction $u = (1,1)$. □
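Both monotonicity facts can be spot-checked on a grid. The sketch below assumes the standard form of the Chernoff upper-tail, rewritten as $\exp(t - \mu - t\ln(t/\mu))$, and checks the shift property in the direction in which it is applied in the proof of Theorem 7.6, namely that lowering both the mean and the threshold by the same $r \geq 0$ does not increase the bound. The function names are hypothetical.

```python
import math

def chernoff_u(mu, t):
    """Chernoff-U(mu, t) = exp(t - mu - t ln(t / mu)), valid for 0 < mu <= t."""
    return math.exp(t - mu - t * math.log(t / mu))

def check_monotone_in_mu(t, steps=50):
    """Grid check: Chernoff-U(mu, t) is nondecreasing in mu on (0, t]."""
    mus = [t * (j + 1) / steps for j in range(steps)]
    vals = [chernoff_u(mu, t) for mu in mus]
    return all(u <= w + 1e-12 for u, w in zip(vals, vals[1:]))

def check_shift(mu, t, steps=50):
    """Grid check: Chernoff-U(mu - r, t - r) <= Chernoff-U(mu, t) for 0 <= r < mu."""
    base = chernoff_u(mu, t)
    rs = [mu * j / steps for j in range(steps)]  # r runs from 0 up to just below mu
    return all(chernoff_u(mu - r, t - r) <= base + 1e-12 for r in rs)

print(check_monotone_in_mu(5.0), check_shift(2.0, 5.0))
```

In log form the two facts reduce to $\partial_\mu = t/\mu - 1 \geq 0$ and $\partial_\mu + \partial_t = t/\mu - 1 - \ln(t/\mu) \geq 0$.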

Proposition A.5. Let $f_1(g) = 2g(-2 + \ln(4-3g) - \ln(8-3g) - \ln(g+4) + \ln(g+8) - 2\ln 2)$ and
let $f_2(g) = 4g\ln 2 - (4-3g)\ln(4-3g) + (8-3g)\ln(8-3g) + (4+g)\ln(4+g) - (8+g)\ln(8+g)$.
For any $g \in (0,1)$ we have

$$
14/g \geq 1 + \frac{f_1(g)}{f_2(g)} \qquad (11)
$$

Proof. Let us first consider the denominator $f_2(g)$. Note that $f_2''(g)$ is a rational function, and simple
algebra shows that its only root is at $g = -16/9$. As $f_2''(0) = -1$, this implies that $f_2''(g) < 0$ for
all $g \in (0,1)$. Thus, $f_2'(g)$ is decreasing in this range. As $f_2'(0) = 0$, this implies that $f_2'(g) < 0$ for
$g \in (0,1)$. As $f_2(0) = 0$, this further implies that $f_2(g) < 0$ for $g \in (0,1)$.

We may thus cross-multiply (11), taking into account the fact that the denominator is negative.
Thus to show (11) it suffices to show that $h(g) \leq 0$, where we define

$$
\begin{aligned}
h(g) ={}& (-5g^2 + 46g - 56)\ln(4-3g) + (5g^2 - 50g + 112)\ln(8-3g) \\
&+ (g^2 + 10g + 56)\ln(g+4) + (-g^2 - 6g - 112)\ln(g+8) + 4g^2 + 56g\ln 2
\end{aligned}
$$

Simple calculus shows that $h'''(g)$ is a rational function of $g$, and it has no roots in the range $(0,1)$.
As $h'''(0) = -75/8$, this implies that $h'''(g) < 0$ for all $g \in (0,1)$. As $h''(0) = -0.454$, this implies
that $h''(g) < 0$ for all $g \in (0,1)$. As $h'(0) = 0$, this implies that $h'(g) < 0$ for all $g \in (0,1)$. As
$h(0) = 0$, this implies that $h(g) < 0$ for all $g \in (0,1)$. □
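The inequality of Proposition A.5 can also be checked numerically. The sketch below is a spot check on a grid, not a proof: it assumes $f_1(g) = 2g(-2 + \ln(4-3g) - \ln(8-3g) - \ln(g+4) + \ln(g+8) - 2\ln 2)$ and $f_2(g) = 4g\ln 2 - (4-3g)\ln(4-3g) + (8-3g)\ln(8-3g) + (4+g)\ln(4+g) - (8+g)\ln(8+g)$, the forms consistent with the coefficients of $h(g)$, and verifies that $f_2 < 0$ and $1 + f_1/f_2 \leq 14/g$.

```python
import math

def f1(g):
    """Numerator term of condition (7)."""
    return 2 * g * (-2 + math.log(4 - 3 * g) - math.log(8 - 3 * g)
                    - math.log(g + 4) + math.log(g + 8) - 2 * math.log(2))

def f2(g):
    """Denominator term of condition (7); negative on (0, 1)."""
    return (4 * g * math.log(2) - (4 - 3 * g) * math.log(4 - 3 * g)
            + (8 - 3 * g) * math.log(8 - 3 * g) + (4 + g) * math.log(4 + g)
            - (8 + g) * math.log(8 + g))

def slack(g):
    """14/g - (1 + f1/f2); Proposition A.5 asserts this is nonnegative on (0, 1)."""
    return 14 / g - (1 + f1(g) / f2(g))

grid = [j / 100 for j in range(5, 100)]   # g = 0.05, 0.06, ..., 0.99
print(min(slack(g) for g in grid))
```

The grid starts at $0.05$ because $f_2(g)$ is of order $g^2$ near zero and its evaluation suffers from cancellation for very small $g$.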